Preface


The AT&T C++ Language System Selected Readings contains papers about the C++ language. The 
manual is part of a set of four documents that are supplied with your C++ Language System. The 
other documents are: 

■ the Release Notes, which describe the contents of this release, how to install it, and changes to the 
language 

■ the Product Reference Manual, which provides a complete definition of the C++ language
supported by the Release 2.0 C++ Language System

■ the Library Manual, which describes the three C++ class libraries and tells you how to use them 

The seven chapters in this manual are based on technical memoranda by authors working with various 
aspects of the C++ language. These chapters cover features of the language provided by Release 2.0 of 
the translator. 

■ Chapter 1 lists the new features of C++ and describes each one briefly 

■ Chapter 2 is a tutorial showing you how to use the special features that C++ provides 

■ Chapter 3 is an overview of the language provided with Release 2.0 

■ Chapter 4 describes support for object-oriented programming in C++ 

■ Chapter 5 explains the new multiple inheritance feature and describes its use 

■ Chapter 6 explains the new type-safe linkage capabilities 

■ Chapter 7 explains levels of protection in C++ class definitions 

■ Appendix A contains the manual pages for the C++ Language System, including the CC, c++filt, 
and demangle commands 

To make the best use of the Selected Readings, you should be familiar with the C programming 
language and the C programming environment under the UNIX® operating system. Refer to
Appendix B of the Release Notes for further sources of information about these topics.
The Evolution of C++: 1985 to 1989


NOTE 


This chapter is taken directly from a paper by Bjarne Stroustrup. 


Abstract 

The C++ Programming Language describes C++ as defined and implemented in August 1985. This paper
describes the growth of the language since then and clarifies a few points in the definition. It is
emphasized that these language modifications are extensions; C++ has been and will remain a stable
language suitable for long term software development. The main new features of C++ are: multiple
inheritance, type-safe linkage, better resolution of overloaded functions, recursive definition of
assignment and initialization, better facilities for user-defined memory management, abstract classes, static
member functions, const member functions, protected members, overloading of operator ->, and
pointers to members. These features are provided in the 2.0 release of C++.


Introduction 


As promised in The C++ Programming Language, C++ has been evolving to meet the needs of its users. 
This evolution has been guided by the experience of users of widely varying backgrounds working in 
a great range of application areas. The primary aim of the extensions has been to enhance C++ as a 
language for data abstraction and object-oriented programming in general and to enhance it as a tool 
for writing high-quality libraries of user-defined types in particular. By a high-quality library I mean a 
library that provides a concept to a user in the form of one or more classes that are convenient, safe, 
and efficient to use. In this context, safe means that a class provides a specific type-secure interface 
between the users of the library and its providers; efficient means that use of the class does not impose 
large overhead in run-time or space on the user compared with hand written C code. 

Portability of at least some C++ implementations is a key design goal. Consequently, extensions that 
would add significantly to the porting time or to the demands on resources for a C++ compiler have
been avoided. This ideal of language evolution can be contrasted with plausible alternative directions
such as making programming convenient

■ at the expense of efficiency or structure;

■ for novices at the expense of generality;

■ in a specific application area by adding special purpose features to the language;

■ by adding language features to increase integration into a specific C++ environment.

For some ideas of where these ideas of language evolution might lead C++ see Chapter 4.

A programming language is only one part of a programmer's world. Naturally, work is being done in 
many other fields (such as tools, environments, libraries, education and design methods) to make C++ 
programming more pleasant and effective. This paper, however, deals strictly with language and 
language implementation issues. 
Overview


This paper is a brief overview of new language features; it is not a manual or a tutorial. The reader is 
assumed to be familiar with the language as described in The C++ Programming Language and to have 
sufficient experience with C++ to recognize many of the problems that the features described here are 
designed to solve or alleviate. Most of the extensions take the form of removing restrictions on what 
can be expressed in C++. 

■ Access Control 

First some extensions to C++'s mechanisms for controlling access to class members are 
presented. Like all extensions described here, they reflect experience with the mechanisms they 
extend and the increased demands posed by the use of C++ in relatively large and complicated
projects. 

■ Overloading Resolution 

■ Type-Safe Linkage 

C++ software is increasingly constructed by combining semi-independent components (modules, 
classes, libraries, etc.) and much of the effort involved in writing C++ goes into the design and
implementation of such components. To help these activities, the rules for overloading function
names and the rules for linking separately compiled code have been refined. 

■ Multiple Inheritance 

■ Base and Member Initialization 

■ Abstract Classes 

Classes are designed to represent general or application specific concepts. Originally, C++
provided only single inheritance, that is, a class could have at most one direct base class, so that the
directly representable relations between classes had to be a tree structure. This is sufficient in a
large majority of cases. However, there are important concepts for which relations cannot be
naturally expressed as a tree, but where a directed acyclic graph is suitable. As a consequence,
C++ has been extended to support multiple inheritance, that is, a class can have several
immediate base classes. The rules for ambiguity resolution and for initialization of base classes
and members have been refined to cope with this extension.

■ static Member Functions

■ const Member Functions

■ Initialization of static Members 

■ Pointers to Members 

The concept of a class member has been generalized. Most important, the introduction of const 
member functions allows the rules for const class objects to be enforced. 

■ User-Defined Free Store Management 

The mechanisms for user-defined memory management have been refined and extended to the 
point where the old and inelegant "assignment to this" mechanism has become redundant. 

■ Assignment and Initialization 

The rules for assignment and initialization of class objects have been made more general and 
uniform to require less work from the programmer. 
■ Operator ->

■ Operator ,

■ Initialization of static objects 

■ Some minor extensions are presented. 

■ Resolutions 

The last section does not describe language extensions but presents the resolution of some details 
of the C++ language definition. 

■ In addition to the extensions mentioned here, many details of the definition of C++ have been 
modified for greater compatibility with the proposed ANSI C standard. 


Access Control 


The rules and syntax for controlling access to class members have been made more flexible.

protected Members 

The simple private/public model of data hiding served C++ well where C++ was used essentially as a
data abstraction language and for a large class of problems where inheritance was used for
object-oriented programming. However, when derived classes are used there are two kinds of users of a
class: derived classes and "the general public." The members and friends that implement the
operations on the class operate on the class objects on behalf of these users. The private/public mechanism
allows the programmer to distinguish clearly between the implementors and the general public, but
does not provide a way of catering specifically to derived classes. This often caused the data hiding
mechanisms to be ignored:

    class X {
        // One bad way:
    public:
        int a;      // "a" should have been private
                    // don't use it unless you are
                    // a member of a derived class
        // ...
    };

Another symptom of this problem was overuse of friend declarations:

    class X {
        // Another bad way:
        friend class D1;    // make derived classes friends
        friend class D2;    // to give access to private member "a"
        // ...
        friend class Dn;
        // ...
        int a;
    public:
        // ...
    };

The solution adopted was protected members. A protected member is accessible to members and 
friends of a derived class as if it were public, but inaccessible to "the general public" just like private
members. For example: 


    class X {
        // private by default:
        int priv;
    protected:
        int prot;
    public:
        int publ;
    };

    class Y : public X {
        void mf();
    };

    void Y::mf()
    {
        priv = 1;   // error: priv is private
        prot = 2;   // OK: prot is protected and mf() is a member of Y
        publ = 3;   // OK: publ is public
    }

    void f(Y* p)
    {
        p->priv = 1;    // error: priv is private
        p->prot = 2;    // error: prot is protected and f() is not a friend
                        // or a member of X or Y
        p->publ = 3;    // OK: publ is public
    }


A more realistic example of the use of protected can be found in this chapter under "Multiple
Inheritance."

A friend function has the same access to protected members as a member function.

A subtle point is that accessibility of protected members depends on the static type of the pointer used 
in the access. A member or a friend of a derived class has access only to protected members of objects 
that are known to be of its derived type. For example: 

    class Z : public Y {
        // ...
    };

    void Y::mf()
    {
        prot = 2;       // OK: prot is protected and mf() is a member function

        X a;
        a.prot = 3;     // error: prot is protected and a is not a Y

        X* p = this;
        p->prot = 3;    // error: prot is protected
                        // and p is not a pointer to Y

        Z b;
        b.prot = 4;     // OK: prot is protected
                        // and mf() is a member and a Z is a Y
    }


A protected member of a class base is a protected member of a class derived from base if the
derivation is public and private otherwise.
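The access rules above can be exercised in a small program. The following is only a sketch: the constructor, the accessor get_prot(), and the helper demo() are assumptions added so that the result can be checked, and the lines that the rules reject are left as comments.

```cpp
#include <cassert>

class X {
    int priv;          // private by default
protected:
    int prot;
public:
    int publ;
    X() : priv(0), prot(0), publ(0) {}
};

class Y : public X {
public:
    void mf();
    int get_prot() const { return prot; }   // hypothetical accessor, for checking
};

void Y::mf()
{
    // priv = 1;         // would not compile: priv is private to X
    prot = 2;            // OK: protected member reached through this (a Y)
    publ = 3;            // OK: public

    // X a; a.prot = 3;  // would not compile: a is an X, not a Y
    Y other;
    other.prot = 4;      // OK: other is known to be a Y
}

// Hypothetical helper: runs mf() and packs the two stored values into one int.
int demo()
{
    Y y;
    y.mf();
    return y.get_prot() * 10 + y.publ;
}
```

Uncommenting either rejected line should produce an access error at compile time, which is the behavior the rule on static types describes.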


Access Control Syntax 

The following example confuses most beginners and even experts get bitten sometimes: 

    class X {
        // ...
    public:
        int f();
    };

    class Y : X { /* ... */ };

    int g(Y* p)
    {
        // ...
        return p->f();  // error!
    }

Here X is by default declared to be a private base class of Y. This means that Y is not a subtype of X
so the call p->f() is illegal because Y does not have a public function f(). Private base classes are quite
an important concept, but to avoid confusion it is recommended that they be declared private
explicitly:

class Y : private X { /* ... */ }; 
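The effect of the default can be checked mechanically. This sketch assumes a compiler that provides the later standard header <type_traits> and static_assert; the class names Y1, Y2, and Y3 are introduced here for illustration only.

```cpp
#include <cassert>
#include <type_traits>

class X {
public:
    int f() { return 1; }
};

class Y1 : X { };            // base is private by default for 'class'
class Y2 : private X { };    // the same thing, stated explicitly
class Y3 : public X { };     // Y3 is a subtype of X

// Outside the classes, only Y3* converts implicitly to X*.
static_assert(!std::is_convertible<Y1*, X*>::value, "private base");
static_assert(!std::is_convertible<Y2*, X*>::value, "private base");
static_assert( std::is_convertible<Y3*, X*>::value, "public base");

int g(Y3* p)
{
    return p->f();   // OK only because X is a public base of Y3
}
```

Replacing Y3 by Y1 or Y2 in g() should reproduce the error discussed above.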


Several public, private, and protected sections are allowed in a class declaration: 


    class X {
    public:
        int i1;
    private:
        int i2;
    public:
        int i3;
    };


These sections can appear in any order. This implies that the public interface of a class may appear
textually before the private "implementation details":

    class S {
    public:
        f();
        int i1;
        // ...
    private:
        g();
        int i2;
        // ...
    };


Adjusting Access 

When a class base is used as a private base class all of its members are considered private members of
the derived class. The syntax base-class-name::member-name can be used to restore access of a member
to what it was in the base:

    class base {
    public:
        int publ;
    protected:
        int prot;
    private:
        int priv;
    };

    class derived : private base {
    protected:
        base::prot;     // protected in derived
    public:
        base::publ;     // public in derived
    };


This mechanism cannot be used to grant access that was not already granted by the base class:

    class derived2 : public base {
    public:
        base::priv;     // error: base::priv is private
    };

This mechanism can be used only to restore access to what it was in the base:

    class derived3 : private base {
    protected:
        base::publ;     // error: base::publ was public
    };

This mechanism cannot be used to remove access already granted:

    class derived4 : public base {
    private:
        base::publ;     // error: base::publ is public
    };

We considered allowing the last two forms and experimented with them, but found that they caused
total confusion among users about the access control rules and the rules for private and public
derivation. Similar considerations led to the decision not to introduce the (otherwise perfectly reasonable)
concept of protected base classes.
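In later standard C++ the same adjustment is normally written with a using-declaration rather than the bare base::member form shown above. The sketch below assumes that spelling; the class more, the constructor, and the helper demo() are hypothetical additions used only to show that access really was restored.

```cpp
#include <cassert>

class base {
public:
    int publ;
protected:
    int prot;
public:
    base() : publ(1), prot(2) {}
};

class derived : private base {
protected:
    using base::prot;    // protected again in derived
public:
    using base::publ;    // public again in derived
};

class more : public derived {
public:
    int sum() { return publ + prot; }   // both reachable from a derived class
};

// Hypothetical helper: writes through the restored public member, then sums.
int demo()
{
    more m;
    m.publ = 10;         // OK: access to publ was restored to public
    return m.sum();      // 10 + 2
}
```

Without the two using-declarations, neither m.publ nor the body of sum() would be accepted, because base is a private base class of derived.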


Details 

A friend function has the same access to base class members as a member function. For example: 


    class base {
    protected:
        int prot;
    public:
        int pub;
    };

    class derived : private base {
    public:
        friend int fr(derived* p) { return p->prot; }
        int mem() { return prot; }
    };

In particular, a friend function can perform the conversion of a pointer to a derived class to its private
base class:

    class derived2 : private base {
    public:
        friend base* fr(derived2* p) { return p; }
        base* mem() { return this; }
    };

    base* f(derived2* p)
    {
        return p;   // error: cannot convert;
                    // base is a private base class of derived2
    }


However, friendship is not transitive. For example: 

    class X {
        friend class Y;
    private:
        int a;
    };

    class Y {
        friend int fr(X* p)
            { return p->a; }    // error: fr() is not a friend of X
        int mem(X* p)
            { return p->a; }    // OK: mem() is a friend of X
    };


Overloading Resolution 

The C++ overloading mechanism was revised to allow resolution of types that used to be "too similar" 
and to gain independence of declaration order. The resulting scheme is more expressive and catches 
more ambiguity errors. Consider 

double abs(double); 
float abs(float); 

To cope with single precision floating point arithmetic it must be possible to declare both of these
functions; now it is. The effect of any call of abs() given the declarations above would be the same if
the order of declarations were reversed:

    float abs(float);
    double abs(double);

Here is a slightly simplified explanation of the new rules. Note that, with the exception of a few cases
where the older rules allowed order dependence, the new rules are compatible and old programs
produce identical results under the new rules. For the last two years or so C++ implementations have
issued warnings for the now "outlawed" order dependent resolutions.
C++ distinguishes five kinds of "matches":

■ Match using no or only unavoidable conversions (for example, array name to pointer, function
name to pointer to function, and T to const T).

■ Match using integral promotions (as defined in the proposed ANSI C standard; that is, char to
int, short to int and their unsigned counterparts) and float to double.

■ Match using standard conversions (for example, int to double, derived* to base*, unsigned int
to int).

■ Match using user defined conversions (both constructors and conversion operators).

■ Match using the ellipsis ... in a function declaration.


Consider first functions of a single argument. The idea is always to choose the "best" match, that is
the one highest on the list above. If there are two best matches the call is ambiguous and thus a
compile time error. For example,

    float abs(float);
    double abs(double);
    int abs(int);
    unsigned abs(unsigned);
    char abs(char);

    abs(1);         // abs(int);
    abs(1U);        // abs(unsigned);
    abs(1.0);       // abs(double);
    abs(1.0F);      // abs(float);
    abs('a');       // abs(char);
    abs(1L);        // error: ambiguous, abs(int) or abs(double)

Here, the calls take advantage of the ANSI C notation for unsigned and float literals and of the C++
rule that a character constant is of type char. The call with the long argument 1L is ambiguous since
abs(int) and abs(double) would be equally good matches (match with standard conversion).
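The single-argument cases can be tried directly. In this sketch the overloads are named which() instead of abs() (an assumption, chosen to avoid clashing with library declarations of abs), and each returns the name of its own parameter type so the selection is visible.

```cpp
#include <cassert>
#include <cstring>

// Each overload reports its own parameter type; 'which' is a hypothetical name.
const char* which(float)    { return "float"; }
const char* which(double)   { return "double"; }
const char* which(int)      { return "int"; }
const char* which(unsigned) { return "unsigned"; }
const char* which(char)     { return "char"; }

// Hypothetical helper: compares two C strings for equality.
bool same(const char* a, const char* b) { return std::strcmp(a, b) == 0; }

void demo()
{
    assert(same(which(1),    "int"));
    assert(same(which(1U),   "unsigned"));
    assert(same(which(1.0),  "double"));
    assert(same(which(1.0F), "float"));
    assert(same(which('a'),  "char"));
    // which(1L);   // rejected as ambiguous: there is no exact match for long
}
```

Every accepted call here is an exact match, so the order in which the five overloads are declared makes no difference.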

Hierarchies established by public class derivations are taken into account in function matching and
where a standard conversion is needed the conversion to the "most derived" class is chosen. A void*
argument is chosen only if no other pointer argument matches. For example:

    class B { /* ... */ };
    class BB : public B { /* ... */ };
    class BBB : public BB { /* ... */ };

    f(B*);
    f(BB*);
    f(void*);

    void g(BBB* pbbb, int* pi)
    {
        f(pbbb);    // f(BB*);
        f(pi);      // f(void*);
    }

This ambiguity resolution rule matches the rule for virtual function calls where the member from the 
most derived class is chosen. 
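A runnable version of this example might look as follows. The integer return values are arbitrary tags, and the two wrapper functions are hypothetical; they exist only so the chosen overload can be observed.

```cpp
#include <cassert>

class B   { };
class BB  : public B  { };
class BBB : public BB { };

// Return codes are arbitrary tags used only to show which overload ran.
int f(B*)    { return 1; }
int f(BB*)   { return 2; }
int f(void*) { return 3; }

int pick_for_bbb(BBB* p) { return f(p); }   // nearest base wins: f(BB*)
int pick_for_int(int* p) { return f(p); }   // only void* fits: f(void*)
```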

If two otherwise equally good matches differ in terms of const, the const specifier is taken into account 
in function matching for pointer and reference arguments. For example: 

char* strtok(char*, const char*); 

const char* strtok(const char*, const char*); 

    void g(char* vc, const char* vcc)
    {
        char* p1 = strtok(vc, "a");         // strtok(char*, const char*);
        const char* p2 = strtok(vcc, "a");  // strtok(const char*, const char*);
        char* p3 = strtok(vcc, "a");        // error
    }

In the third case, strtok(const char*, const char*) is chosen because vcc is a const char*. This leads to 
an attempt to initialize the char* p3 with the const char* result. 
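The same effect can be shown without the library function. The names first(), tag(), and demo() below are hypothetical; first() stands in for the pair of strtok() declarations, and tag() reveals the type of each result.

```cpp
#include <cassert>

// Hypothetical pair of overloads distinguished only by const on the pointee.
char*       first(char* s)       { return s; }
const char* first(const char* s) { return s; }

// Tags reveal which overload a call selects.
int tag(char*)       { return 1; }
int tag(const char*) { return 2; }

int demo()
{
    char buf[] = "abc";
    const char* cs = "xyz";

    char*       p1 = first(buf);   // first(char*)
    const char* p2 = first(cs);    // first(const char*)
    // char*    p3 = first(cs);    // would not compile: result is const char*

    return tag(p1) * 10 + tag(p2);
}
```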

For calls involving more than one argument a function is chosen provided it has a better match than
every other function for at least one argument and at least as good a match as every other function for
every argument. For example:

    class complex { ... complex(double); };

    f(int, double);
    f(double, int);
    f(complex, int);
    f(int ...);
    f(complex ...);

    complex z = 1;

    f(1, 2.0);      // f(int,double);
    f(1.0, 2);      // f(double,int);
    f(z, 1.2);      // f(complex,int);
    f(z, 1, 3);     // f(complex ...);
    f(2.0, z);      // f(int ...);
    f(1, 1);        // error: ambiguous, f(int,double) and f(double,int)

The unfortunate narrowing from double to int in the third and the second to last cases causes
warnings. Such narrowings are allowed to preserve compatibility with C. In this particular case the
narrowing is harmless, but in many cases double to int conversions are value destroying and they should
never be used thoughtlessly.

As ever, at most one user-defined and one built-in conversion may be applied to a single argument.
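The two-argument rule can be seen in a reduced form with only the first two declarations. This is a sketch: the integer tags and the helper demo() are assumptions added for checking.

```cpp
#include <cassert>

// Arbitrary tags identify the overload chosen.
int f(int, double) { return 1; }
int f(double, int) { return 2; }

int demo()
{
    int a = f(1, 2.0);   // exact on both arguments: f(int,double)
    int b = f(1.0, 2);   // exact on both arguments: f(double,int)
    // f(1, 1);          // rejected: each candidate is better on one argument
    return a * 10 + b;
}
```

In the rejected call, f(int,double) is the better match for the first argument and f(double,int) for the second, so neither is at least as good as the other on every argument.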


Type-Safe Linkage 


Originally, C++ allowed a name to be used for more than one function ("to be overloaded") only after
an explicit overload declaration. For example:

    overload max;                   // 'overload' now obsolete
    int max(int,int);
    double max(double,double);

It used to be considered too dangerous simply to use a name for two functions without previous
declaration of intent. For example:

    int abs(int);
    double abs(double);     // used to be an error

This fear of overloading had two sources: 

■ concern that undetected ambiguities could occur 

■ concern that a program could not be properly linked unless the programmer explicitly declared
where overloading was to take place.

The former fear proved largely groundless and the few problems found in actual use have been taken
care of by the new order-independent overloading resolution rules. The latter fear proved to have a
basis in a general problem with C separate compilation rules that had nothing to do with overloading.

On the other hand, the redundant overload declarations themselves became an increasingly serious
problem. Since they had to precede (or be part of) the declarations they were to enable, it was not 
possible to merge pieces of software using the same function name for different functions unless both 
pieces had declared the function overloaded. This is not usually the case. In particular, the name one 
wants to overload is often the name of a C standard library function declared in a C header. For 
example, one might have standard headers like this: 

    /* Header for C standard math library, math.h: */
    double sqrt(double);
    /* ... */

    // header for C++ standard complex arithmetic library, complex.h:
    overload sqrt;
    complex sqrt(complex);
    // ...

and try to use them like this:

    #include <math.h>
    #include <complex.h>

This causes a compile time error when the overload for sqrt() is seen after the first declaration of
sqrt(). Rearranging declarations, putting constraints on the use of header files, and sprinkling
overload declarations everywhere "just in case" can alleviate this kind of problem, but we found the use of
such tricks unmanageable in all but the simplest cases. Abolishing overload declarations (and getting
rid of the overload keyword in the process) is a much better idea.

Doing things this way does pose an implementation problem, though. When a single name is used for 
several functions, one must be able to tell the linker which calls are to be linked to which function 
definitions. Ordinary linkers are not equipped to handle several functions with the same name.
However, they can be tricked into handling overloaded names by encoding type information into the
names seen by the linker. For example, the names for these two functions:

    double sqrt(double);
    complex sqrt(complex);

become:

    sqrt__Fd
    sqrt__F7complex

in the compiler output to the linker. The user and the compiler see the C++ source text where the 
type information serves to disambiguate and the linker sees the names that have been disambiguated 
by adding a textual representation of the type information. Naturally, one might have a linker that 
understood about type information, but it is not necessary and such linkers are certainly not common. 
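To make the idea concrete, here is a toy encoder, assuming the encoded names take the form sqrt__Fd and sqrt__F7complex: the function name, then __F, then one code per argument type. The letters used for int and char, the function names, and the use of later standard library facilities are all assumptions of this sketch; it is not the translator's actual algorithm and covers only a handful of types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy encoder: built-in types get a one-letter code, anything else is
// treated as a class name and written as <length><name>. Only a few
// built-in types are covered; this is not a complete mangling scheme.
std::string encode_type(const std::string& t)
{
    if (t == "double") return "d";
    if (t == "int")    return "i";
    if (t == "char")   return "c";
    return std::to_string(t.size()) + t;
}

// Builds <name>__F<encoded argument types>.
std::string encode(const std::string& name,
                   const std::vector<std::string>& args)
{
    std::string out = name + "__F";
    for (std::size_t i = 0; i < args.size(); ++i)
        out += encode_type(args[i]);
    return out;
}
```

Because the argument types become part of the name, two functions called sqrt with different argument lists reach the linker as two different symbols.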

Using this encoding or any equivalent scheme solves a long standing problem with C linkage.
Inconsistent function declarations in separately compiled code fragments are now caught. For example:

    // file1.c:
    extern name* lookup(table* tbl, const char* name);
    // ...
    void some_fct(char* s)
    {
        name* n = lookup(gtbl,s);
    }

looks plausible and the compiler can find no fault with it. However, if the definition of lookup() turns
out to be:

    // file2.c:
    int lookup(table* tbl, const char* name, int index)
    {
        // ...
    }

the linker now has enough information to catch the error.

Finally, we have to face the problem of linking to code fragments written in other languages that do 
not know the C++ type system or use the C++ type encoding scheme. One could imagine all
compilers for all languages on a system agreeing on a type system and a linkage scheme such that linkage
of code fragments written in different languages would be safe. However, since this will not typically
be the case we need a way of calling functions written in a language that does not use a type-safe
linkage scheme and a way to write C++ functions that obey the different (and typically unsafe) linkage
rules for other languages. This is done by explicitly specifying the name of the desired linkage
convention in a declaration:

extern "C" double sqrt (double); 

or by enclosing whole groups of declarations in a linkage directive: 

extern "C" { 

    #include <math.h>
    }

By applying the second form of linkage directive to standard header files one can avoid littering the
user code with linkage directives. This type-safe linkage mechanism is discussed in detail in Chapter 6.
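Both forms of the linkage directive can be mixed with ordinary overloaded C++ functions in one file. In this sketch the function names twice, half, and add_ints are hypothetical.

```cpp
#include <cassert>

// C linkage: one function per name, no type information in the linker name.
extern "C" double twice(double x) { return 2 * x; }

// C++ linkage: the same name may be shared by several functions.
double half(double x) { return x / 2; }
int    half(int x)    { return x / 2; }

extern "C" {
    // A whole group of declarations can share the linkage directive.
    int add_ints(int a, int b);
}

int add_ints(int a, int b) { return a + b; }   // keeps C linkage from above
```

Only the functions with C++ linkage can be overloaded freely; a second extern "C" function named twice with a different argument list would not be accepted.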
Multiple Inheritance


Consider writing a simulation of a network of computers. Each node in the network is represented by 
an object of class Switch, each user or computer by an object of class Terminal, and each communica¬ 
tion line by an object of class Line. One way to monitor the simulation (or a real network of the same 
structure) would be to display the state of objects of various classes on a screen. Each object to be 
displayed is represented as an object of class Displayed. Objects of class Displayed are under control 
of a display manager that ensures regular update of a screen and/or data base. The classes Terminal 
and Switch are derived from a class Task that provides the basic facilities for co-routine style 
behavior. Objects of class Task are under control of a task manager (scheduler) that manages the real 
processor(s).

Ideally Task and Displayed are classes from a standard library. If you want to display a terminal, 
class Terminal must be derived from class Displayed. Class Terminal, however, is already derived
from class Task. In a single inheritance language, such as Simula67, we have only two ways of
solving this problem: deriving Task from Displayed or deriving Displayed from Task. Neither is ideal
since they both create a dependency between the library versions of two fundamental and independent 
concepts. Ideally, one would want to be able to say that a Terminal is a Task and a Displayed; that a 
Line is a Displayed but not a Task; and that a Switch is a Task but not a Displayed. 

The ability to express this class hierarchy, that is, to derive a class from more than one base class, is
usually referred to as multiple inheritance. Other examples involve the representation of various kinds 
of windows in a window system and the representation of various kinds of processors and compilers 
for a multi-machine, multi-environment debugger. 

In general, multiple inheritance allows a user to combine concepts represented as classes into a
composite concept represented as a derived class. C++ allows this to be done in a general, type-safe,
compact, and efficient manner. The basic scheme allows independent concepts to be combined and
ambiguities to be detected at compile time. An extension of the base class concept, called virtual base
classes, allows dependences between classes in an inheritance DAG (Directed Acyclic Graph) to be
expressed.

Ambiguous uses are detected at compile time: 

    class A { f(); /* ... */ };
    class B { f(); /* ... */ };
    class C : public A, public B { };

    void g() {
        C* p;
        p->f();     // error: ambiguous
    }

Note that it is not an error to combine classes containing the same member names in an inheritance 
DAG. The error occurs only when a name is used in an ambiguous way — and only then does the 
compiler have to reject the program. This is important since most potential ambiguities in a program 
never appear as actual ambiguities. Considering a potential ambiguity an error would be far too
restrictive.

Typically one would resolve the ambiguity by adding a function:

    class C : public A, public B {
    public:
        f()
        {
            // C's own stuff
            A::f();
            B::f();
        }
        // ...
    };


This example shows the usefulness of naming members of a base class explicitly with the name of the 
base class. In the restricted case of single inheritance, this way is marginally less elegant than the 
approach taken by Smalltalk and other languages (simply referring to "my super class" instead of 
using an explicit name). However, the C++ approach extends cleanly to multiple inheritance. 
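A small runnable variant of this resolution follows. As an assumption made for checking, each f() returns a string naming the class it belongs to, and demo() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

class A { public: std::string f() { return "A"; } };
class B { public: std::string f() { return "B"; } };

class C : public A, public B {
public:
    // Resolves the ambiguity by naming each base explicitly.
    std::string f() { return "C" + A::f() + B::f(); }
};

// Hypothetical helper: calls the resolved function on a C object.
std::string demo()
{
    C c;
    return c.f();
}
```

Because C declares its own f(), a call through a C no longer has to choose between A::f() and B::f().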

A class can appear more than once in an inheritance DAG: 

    class A : public L { /* ... */ };
    class B : public L { /* ... */ };
    class C : public A, public B { /* ... */ };

In this case, an object of class C has two sub-objects of class L, namely A::L and B::L. This is often
useful, as in the case of an implementation of lists requiring each element on a list to contain a link 
element. If in the example above L is a link class then a C can be on both the list of As and the list of 
Bs at the same time. 
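That a C really contains two distinct L sub-objects can be observed by comparing addresses. In this sketch the member next and the two conversion helpers are hypothetical.

```cpp
#include <cassert>

struct L { L* next; L() : next(0) {} };   // hypothetical link class

class A : public L { };
class B : public L { };
class C : public A, public B { };

// The two conversions select different L sub-objects of the same C.
L* link_in_a(C* p) { return static_cast<A*>(p); }
L* link_in_b(C* p) { return static_cast<B*>(p); }
```

A direct conversion from C* to L* would be ambiguous, which is why each helper first names the path to take.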

Virtual functions work as expected; that is the version from the most derived class is used: 


    class A { public: virtual f(); /* ... */ };
    class B { public: virtual g(); /* ... */ };
    class C : public A, public B { public: f(); g(); /* ... */ };

    void ff()
    {
        C obj;
        A* pa = &obj;
        B* pb = &obj;

        pa->f();    // calls C::f
        pb->g();    // calls C::g
    }


This way of combining classes is ideal for representing the union of independent or nearly
independent concepts. However, in some interesting cases, such as the window example, a more explicit way
of expressing sharing and dependency is needed. 
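The behavior of virtual functions under multiple inheritance can be checked with a small program. The return values and the helper demo() are assumptions added so that the calls leave a visible result.

```cpp
#include <cassert>

class A { public: virtual int f() { return 1; } virtual ~A() {} };
class B { public: virtual int g() { return 2; } virtual ~B() {} };

class C : public A, public B {
public:
    int f() { return 10; }   // overrides A::f
    int g() { return 20; }   // overrides B::g
};

// Hypothetical helper: calls through both base-class pointers.
int demo()
{
    C obj;
    A* pa = &obj;
    B* pb = &obj;
    return pa->f() + pb->g();   // C::f and C::g are called
}
```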
Virtual base classes provide a mechanism for sharing between sub-objects in an inheritance DAG and
for expressing dependencies among such sub-objects:

    class A : public virtual W { /* ... */ };
    class B : public virtual W { /* ... */ };
    class C : public A, public B, public virtual W { /* ... */ };

In this case there is only one object of class W in class C.

Constructing the tables for virtual function calls can get quite complicated when virtual base classes
are used. However, virtual functions work as usual by choosing the version from the most derived
class in a call:

    class W {
        // ...
    public:
        virtual void f();
        virtual void g();
        virtual void h();
        virtual void k();
        // ...
    };

    class AW : public virtual W { /* ... */ public: void g(); /* ... */ };
    class BW : public virtual W { /* ... */ public: void f(); /* ... */ };
    class CW : public AW, public BW, public virtual W {
        // ...
    public:
        void h();
        // ...
    };

    CW* pcw = new CW;

    pcw->f();           // invokes BW::f()
    pcw->g();           // invokes AW::g()
    pcw->h();           // invokes CW::h()
    ((AW*)pcw)->f();    // invokes BW::f() !!!

The reason that BW::f() is invoked in the last example is that the only f() in an object of class CW is
the one found in the (shared) sub-object W, and that one has been overridden by BW::f().

Ambiguities are easily detected at the point where CW's table of virtual functions is constructed. The 
rule for detecting ambiguities in a class DAG is that all re-definitions of a virtual function from a
virtual base class must occur on a single path through the DAG. The example above can be drawn like
this:

Figure 1-1: A Directed Acyclic Graph

                  W { f g h k }

        AW { g }                 BW { f }

                  CW { h }


Note that a call "up" through one path of the DAG to a virtual function may result in the call of a
function (re-defined) in another path (as happened in the call ((AW*)pcw)->f() in the example above).
In this example, an ambiguity would occur if a function f() was added to AW. This ambiguity might
be resolved by adding a function f() to CW.
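The cross-path call can be reproduced in a few lines. In this sketch each function returns a character naming the class that supplied it, and the helper f_via_aw() is hypothetical; both are assumptions added so the result can be inspected.

```cpp
#include <cassert>

class W {
public:
    virtual char f() { return 'W'; }
    virtual char g() { return 'W'; }
    virtual char h() { return 'W'; }
    virtual ~W() {}
};

class AW : public virtual W { public: char g() { return 'A'; } };
class BW : public virtual W { public: char f() { return 'B'; } };

class CW : public AW, public BW, public virtual W {
public:
    char h() { return 'C'; }
};

// Calls f() through the AW path; the override in BW is still the one used.
char f_via_aw(CW* p) { return static_cast<AW*>(p)->f(); }
```

Adding an f() to AW as well would leave CW with two competing overriders, and the class would be rejected unless CW supplied its own f().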

Programming with virtual bases is trickier than programming with non-virtual bases. The problem is 
to avoid multiple calls of a function in a virtual class when that is not desired. Here is a possible style: 

    class W {
        // ...
    protected:
        _f() { my stuff }
        // ...
    public:
        f() { _f(); }
        // ...
    };

Each class provides a protected function doing "its own stuff," _f(), for use by derived classes and a
public function f() as the interface for use by the "general public."

class A : public virtual W {
    // ...
protected:
    _f() { my stuff }
    // ...
public:
    f() { _f(); W::_f(); }
    // ...
};

A derived class f() does its "own stuff" by calling _f() and its base classes' "own stuff" by calling their
_f() functions.

class B : public virtual W {
    // ...
protected:
    _f() { my stuff }
    // ...
public:
    f() { _f(); W::_f(); }
    // ...
};

In particular, this style enables a class that is (indirectly) derived twice from a class W to call W::f()
once only:

class C : public A, public B, public virtual W {
    // ...
protected:
    _f() { my stuff }
    // ...
public:
    f() { _f(); A::_f(); B::_f(); W::_f(); }
    // ...
};
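
A compilable sketch of this style follows, assuming a modern compiler. The protected helper is renamed `f_own()` here, and the global `call_log` string and the function `run_f_on_C()` are hypothetical additions used only to record which "own stuff" functions ran.

```cpp
#include <string>

// Hypothetical log recording the order in which the helpers are called.
static std::string call_log;

class W {
protected:
    void f_own() { call_log += "W "; }     // W's "own stuff"
public:
    virtual void f() { f_own(); }
    virtual ~W() {}
};
class A : public virtual W {
protected:
    void f_own() { call_log += "A "; }
public:
    void f() override { f_own(); W::f_own(); }
};
class B : public virtual W {
protected:
    void f_own() { call_log += "B "; }
public:
    void f() override { f_own(); W::f_own(); }
};
class C : public A, public B, public virtual W {
protected:
    void f_own() { call_log += "C "; }
public:
    // Calls each class's helper exactly once, including W's.
    void f() override { f_own(); A::f_own(); B::f_own(); W::f_own(); }
};

// Runs f() on a C object and returns the recorded call sequence.
std::string run_f_on_C() {
    call_log.clear();
    C c;
    c.f();
    return call_log;
}
```

Had C::f() called A::f() and B::f() instead of the protected helpers, W's helper would have run twice.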


Method combination schemes, such as the ones found in Lisp systems with multiple inheritance, were 
considered as a way of reducing the amount of code a programmer needed to write in cases like the 
one above. However, none of these schemes appeared to be sufficiently simple, general, and efficient
to warrant the complexity it would add to C++.

As described in Chapter 5, a virtual function call is about as efficient as a normal function call, even
in the case of multiple inheritance. The added cost is 5 to 6 memory references per call. This compares
with the 3 to 4 extra memory references incurred by a virtual function call in a C++ compiler
providing only single inheritance. The multiple inheritance scheme currently used causes an increase
of about 50% in the size of the tables used to implement the virtual functions compared with the older
single inheritance implementation. To offset that, the multiple inheritance implementation optimizes
away quite a few spurious tables generated by the older single-inheritance implementations so that the
memory requirement of a program using virtual functions actually decreases in most cases.

It would have been nice if there had been absolutely no added cost for the multiple inheritance scheme
when only single inheritance is used. Such schemes exist, but involve the use of tricks that cannot be
done by a C++ compiler generating C.

Base and Member Initialization 


The syntax for initializing base and members has been extended to cope with multiple inheritance
and the order of initialization has been more precisely defined. Leaving the initialization order
unspecified in the original definition of C++ gave an unnecessary degree of freedom to language
implementors at the expense of the users. In most cases, the order of initialization of members doesn't
matter and in most cases where it does matter, the order dependency is an indication of bad design.

In a few cases, however, the programmer absolutely needs control of the order of initialization. For
example, consider transmitting objects between machines. An object must be re-constructed by a
receiver in exactly the reverse order in which it was decomposed for transmission by a sender. This
cannot be guaranteed for objects communicated between programs compiled by compilers from different
suppliers unless the language specifies the order of construction.

Consider 

class A { public: A(int); A(); /* ... */ };
class B { public: B(int); B(); /* ... */ };

class C : public A, public B {
    const a;
    int& b;
public:
    C(int&);
};

In a constructor the sub-objects representing base classes can be referred to by their names:

C::C(int& rr) : A(1), B(2), a(3), b(rr) { /* ... */ }

The initialization takes place in the order of declaration in the class with base classes initialized before 
members³, so the initialization order for class C is A, B, a, b. This order is independent of the order
of explicit initializers so 


C::C(int& rr) : b(rr), B(2), a(3), A(1) { /* ... */ }

also initializes in the declaration order A, B, a, b.
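
The rule can be observed with a short sketch, assuming a modern compiler. The `Tracer` member type, the global `order` string, and `construction_order()` are hypothetical names introduced only to log constructor calls; the initializers are deliberately written out of declaration order, which many compilers flag with a warning.

```cpp
#include <string>

static std::string order;       // hypothetical trace of constructor calls

struct Tracer {                 // hypothetical member type that logs its name
    Tracer(const char* name) { order += name; }
};
struct A { A(int) { order += "A"; } };
struct B { B(int) { order += "B"; } };

class C : public A, public B {
    Tracer a;
    Tracer b;
public:
    // Initializers listed in "wrong" order on purpose; declaration order wins.
    C() : b("b"), B(2), a("a"), A(1) {}
};

// Constructs one C and returns the order in which the parts were initialized.
std::string construction_order() {
    order.clear();
    C c;
    return order;
}
```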

The reason for ignoring the order of initializers is to preserve the usual FIFO ordering of constructor 
and destructor calls. Allowing two constructors to use different orders of initialization of bases and
members would constrain implementations to use more dynamic and more expensive strategies.

Using the base class name explicitly clarifies even the case of single inheritance without member
initialization:

class vector {
    // ...
public:
    vector(int);
    // ...
};

class vec : public vector {
    // ...
public:
    vec(int, int);
    // ...
};

It is reasonably clear even to novices what is going on here:



vec::vec(int low, int high) : vector(high-low-1) { /* ... */ }

On the other hand, this version:

vec::vec(int low, int high) : (high-low-1) { /* ... */ }

has caused much confusion over the years. The old-style base class initializer is of course still 
accepted. It can be used only in the single inheritance case since it is ambiguous otherwise. 

A virtual base is constructed before any of its derived classes. Virtual bases are constructed before any 
non-virtual bases and in the order they appear on a depth-first left-to-right traversal of the inheritance
DAG. This rule applies recursively for virtual bases of virtual bases. 

A virtual base is initialized by the "most derived" class of which it is a base. For example: 

class V { public: V(); V(int); /* ... */ };
class A : public virtual V { public: A(); A(int); /* ... */ };
class B : public virtual V { public: B(); B(int); /* ... */ };
class C : public A, public B { public: C(); C(int); /* ... */ };

A::A(int i) : V(i) { /* ... */ }
B::B(int i) { /* ... */ }
C::C(int i) { /* ... */ }

V v(1);    // use V(int)
A a(2);    // use V(int)
B b(3);    // use V()
C c(4);    // use V()
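
A minimal sketch of the "most derived class" rule, assuming a modern compiler. The global `used` string and the two helper functions are hypothetical and exist only to report which V constructor ran. Note that C passes its argument on to A, yet A's initializer for V is ignored because A is not the most derived class in that case.

```cpp
#include <string>

static std::string used;    // hypothetical record of the V constructor chosen

struct V {
    V()    { used = "V()"; }
    V(int) { used = "V(int)"; }
};
struct A : public virtual V { A(int i) : V(i) {} };
struct B : public virtual V { B(int) {} };
struct C : public A, public B {
    C(int i) : A(i), B(i) {}    // no initializer for V, so V() is used
};

std::string v_ctor_for_A() { A a(2); return used; }   // A is most derived
std::string v_ctor_for_C() { C c(4); return used; }   // C is most derived
```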


The order of destructor calls is defined to be the reverse order of appearance in the class declaration 
(members before bases). There is no way for the programmer to control this order - except by the 
declaration order. A virtual base is destroyed after all of its derived classes. 

It might be worth mentioning that virtual destructors are (and always have been) allowed: 

struct B { /* ... */ virtual ~B(); };
struct D : B { ~D(); };

void g() {
    B* p = new D;
    delete p;    // D::~D() is called
}

The word virtual was chosen for virtual base classes because of some rather vague conceptual similarities
to virtual functions and to avoid introducing yet another keyword.


Abstract Classes

One of the purposes of static type checking is to detect mistakes and inconsistencies before a program
is run. It was noted that a significant class of detectable errors was escaping C++'s checking. To add
insult to injury, the language actually forced programmers to write extra code and generate larger
programs to make this happen.

Consider the classic "shape" example. Here, we must first declare a class shape to represent the general
concept of a shape. This class needs two virtual functions rotate() and draw(). Naturally, there
can be no objects of class shape, only objects of specific shapes. Unfortunately C++ did not provide a
way of expressing this simple notion.

The C++ rules specify that virtual functions, such as rotate() and draw(), must be defined in the
class in which they are first declared. The reason for this requirement is to ensure that traditional linkers
can be used to link C++ programs and to ensure that it is not possible to call a virtual function that
has not been defined. So the programmer writes something like this:

class shape {
    point center;
    color col;
    // ...
public:
    where() { return center; }
    move(point p) { center=p; draw(); }
    virtual void rotate(int) { error("cannot rotate"); abort(); }
    virtual void draw() { error("cannot draw"); abort(); }
    // ...
};

This ensures that innocent errors such as forgetting to define a draw() function for a class derived from
shape and silly errors such as creating a "plain" shape and attempting to use it cause run time errors.
Even when such errors are not made, memory can easily get cluttered with unnecessary virtual tables
for classes such as shape and with functions that are never called, such as draw() and rotate(). The
overhead for this can be noticeable.

The solution is simply to allow the user to say that a virtual function does not have a definition; that
is, that it is a "pure virtual function." This is done by an initializer =0:

class shape {
    point center;
    color col;
    // ...
public:
    where() { return center; }
    move(point p) { center=p; draw(); }
    virtual void rotate(int) = 0;    // pure virtual function
    virtual void draw() = 0;         // pure virtual function
    // ...
};

A class with one or more pure virtual functions is an abstract class. An abstract class can only be used
as a base for another class. In particular, it is not possible to create objects of an abstract class. A class
derived from an abstract class must either define the pure virtual functions from its base or again
declare them to be pure virtual functions.

The notion of pure virtual functions was chosen over the idea of explicitly declaring a class to be
abstract because the selective definition of functions is much more flexible.
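
As a hedged illustration for a modern compiler, the sketch below uses a cut-down `shape` with a single pure virtual function. The derived class `circle`, the helper `draw_any()`, and the string return type are assumptions made for the example; `std::is_abstract` is a later library facility used here only to confirm the property at compile time.

```cpp
#include <string>
#include <type_traits>

class shape {
public:
    virtual std::string draw() = 0;     // pure virtual function
    virtual ~shape() {}
};

class circle : public shape {           // hypothetical concrete shape
public:
    std::string draw() override { return "circle"; }
};

// Works on any concrete shape through the abstract interface.
std::string draw_any(shape& s) { return s.draw(); }

// "shape s;" would be rejected by the compiler; these checks state why.
static_assert(std::is_abstract<shape>::value, "shape has a pure virtual function");
static_assert(!std::is_abstract<circle>::value, "circle defines draw()");
```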


Static Member Functions 

A static data member of a class is a member for which there is only one copy rather than one per
object and which can be accessed without referring to any particular object of the class it is a member
of. The reason for using static members is to reduce the number of global names, to make obvious
which static objects logically belong to which class, and to be able to apply access control to their
names. This is a boon for library providers since it avoids polluting the global name space and
thereby allows easier writing of library code and safer use of multiple libraries. These reasons apply
for functions as well as for objects. In fact, most of the names a library provider wants local are function
names. It was also observed that nonportable code, such as

((X*)0)->f();

was used to simulate static member functions. This trick is a time bomb because sooner or later someone
will make an f() that is used this way virtual and the call will fail horribly because there is no X
object at address zero. Even where f() is not virtual such calls will fail under some implementations of
dynamic linking.

A static member function is a member so that its name is in the class scope and the usual access control
rules apply. A static member function is not associated with any particular object and need not be
called using the special member function syntax. For example:


class X {
    int mem;
public:
    static void f(int,X*);
};

void g()
{
    X obj;
    f(1,&obj);        // error (unless there really is
                      // a global function f())
    X::f(1,&obj);     // fine
    obj.f(1,&obj);    // also fine
}

Since a static member function isn't called for a particular object it has no this pointer and cannot
access members without explicitly specifying an object. For example:

void X::f(int i, X* p)
{
    mem = i;       // error: which mem?
    p->mem = i;    // fine
}
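
The following sketch, written for a modern compiler, shows both call forms. The static data member `count` and the accessors `created()` and `value()` are hypothetical additions that make the effect observable; they are not part of the example above.

```cpp
class X {
    int mem;
    static int count;               // hypothetical static data member
public:
    X() : mem(0) { ++count; }
    // No this pointer: the object to modify is passed explicitly.
    static void set(int i, X* p) { p->mem = i; }
    static int created() { return count; }
    int value() const { return mem; }
};

int X::count = 0;                   // static data members need a definition
```

A static member function can be called either as `X::set(1, &obj)` or, equivalently, as `obj.set(1, &obj)`.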


const Member Functions 


Consider this example:

class s {
    int aa;
public:
    void mutate() { aa++; }
    int value() { return aa; }
};

void g()
{
    s o1;
    const s o2;
    o1.mutate();
    o2.mutate();
    int i = o1.value() + o2.value();
}

It seems clear that the call o2.mutate() ought to fail since o2 is a const.

The reason this rule until now has not been enforced is simply that there was no way of distinguishing
a member function that may be invoked on a const object from one that may not. In general, the compiler
cannot deduce which functions will change the value of an object. For example, had mutate()
been defined in a separately compiled source file the compiler would not have been able to detect the
problem at compile time.

The solution to this has two parts. First, const is enforced so that "ordinary" member functions cannot
be called for a const object. Then we introduce the notion of a const member function, that is, a
member function that may be called for all objects including const objects. For example:

class X {
    int aa;
public:
    void mutate() { aa++; }
    int value() const { return aa; }
};

Now X::value() is guaranteed not to change the value of an object and can be used on a const object
whereas X::mutate() can only be called for non-const objects:

int g()
{
    X o1;
    const X o2;
    o1.mutate();                       // fine
    o2.mutate();                       // error
    return o1.value() + o2.value();    // fine
}


In a const member function of X the this pointer points to a const X. This ensures that non-devious
attempts to modify the value of an object through a const member will be caught:

class X {
    int a;
    void cheat() const { a++; }    // error
};

Note that the use of const as a suffix to () is consistent with the use of const as a suffix to *.
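
A small compilable sketch of the rule, assuming a modern compiler. The constructor taking an initial value and the helper `sum()` are hypothetical additions; the rejected call is left as a comment because it would not compile.

```cpp
class X {
    int aa;
public:
    X(int a) : aa(a) {}                 // hypothetical constructor for the demo
    void mutate() { aa++; }             // may only be called for non-const objects
    int value() const { return aa; }    // may be called for any object
};

// Hypothetical helper mixing a non-const and a const object.
int sum(X& o1, const X& o2) {
    o1.mutate();        // fine: o1 is not const
    // o2.mutate();     // would be rejected: o2 is const
    return o1.value() + o2.value();
}
```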


Initialization of static Members 

A static data member of a class must be defined somewhere. The static declaration in the class 
declaration is only a declaration and does not set aside storage or provide an initializer. 

This is a change from the original C++ definition of static members, which relied on implicit definition 
of static members and on implicit initialization of such members to 0. Unfortunately, this style of
initialization cannot be used for objects of all types. In particular, objects of classes with constructors
cannot be initialized this way. Furthermore, this style of initialization relied on linker features that are
not universally available. Fortunately, in the implementations where this used to work it will continue 
to work for some time, but conversion to the stricter style described here is strongly recommended. 

Here is an example: 

class X {
    static int i;
    int j;
    X(int);
    int read();
};

class Y {
    static X a;
    int b;
    Y(int);
    int read();
};

Now X::i and Y::a have been declared and can be referred to, but somewhere definitions must be
provided. The natural place for such definitions is with the definitions of the class member functions.
For example:

// file X.c:
X::X(int jj) { j = jj; }
int X::read() { return j; }
int X::i = 3;

// file Y.c:
Y::Y(int bb) { b = bb; }
int Y::read() { return b; }
X Y::a = 7;


Pointers to Members 

As mentioned in The C++ Programming Language, it was an obvious deficiency that there was no way
of expressing the concept of a pointer to a member of a class in C++. This led to the need to "cheat"
the type system in cases, such as error handling, where pointers to functions are traditionally used.
Consider this example:

struct S {
    int mf(char*);
};

The structure S is declared to be a (trivial) type for which the member function mf() is declared. 

Given a variable of type S the function mf() can be called: 

S a;
int i = a.mf("hello");

The question is "What is the type of mf()?"

The equivalent type of a non-member function 

int f(char*); 


is 


int (char*) 

and a pointer to such a function is of type 
int (*)(char*) 

Such pointers to "normal" functions are declared and used like this:

int f(char*);               // declare function
int (*pf)(char*) = &f;      // declare and initialize pointer to function
int i = (*pf)("hello");     // call function through pointer

A similar syntax is introduced for pointers to members of a specific class. In a definition mf() appears
as:

int S::mf(char*)

The type of S::mf is:

int S::(char*)

that is, "member of S that is a function taking a char* argument and returning an int." A pointer to
such a function is of type:

int (S::*)(char*)

That is, the notation for pointer to member of class S is S::*. We can now write:

// declare and initialize pointer to member function
int (S::*pmf)(char*) = &S::mf;

S a;

// call function through pointer for the object 'a'
int i = (a.*pmf)("hello");



The syntax isn't exactly pretty, but neither is the C syntax it is modeled on. 


A pointer to member function can also be called given a pointer to an object:

// call function through pointer for the object '*p':
int i = (p->*pmf)("hello");

In this case, we might have to handle virtual functions:

struct B {
    virtual f();
};

struct D : B {
    f();
};

int ff(B* pb, int (B::*pbf)())
{
    return (pb->*pbf)();
}

void gg()
{
    D dd;
    int i = ff(&dd, &B::f);
}

This causes a call of D::f(). Naturally, the implementation involves a lookup in dd's table of virtual
functions exactly as a call to a virtual function that is identified by name rather than by a pointer. The
overhead compared to a "normal function call" is the usual about five memory references (dependent
on the machine architecture).
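
A runnable version of this example for a modern compiler might look as follows. The explicit `int` return types and the return values 1 and 2 are assumptions added so that the function actually called can be identified.

```cpp
struct B {
    virtual int f() { return 1; }   // return values are illustrative
    virtual ~B() {}
};

struct D : B {
    int f() override { return 2; }
};

// Calls the member function designated by pbf on the object *pb.
int ff(B* pb, int (B::*pbf)()) {
    return (pb->*pbf)();
}
```

Although the pointer is formed as `&B::f`, the call through it on a D object is dispatched virtually.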

It is also possible to declare and use pointers to members that are not functions: 



struct S {
    int mem;
};

int S::* psm = &S::mem;

void f(S* ps)
{
    ps->*psm = 2;
}

void g()
{
    S a;
    f(&a);
}

This is a complicated way of assigning 2 to a.mem.
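
Pointers to data members become more useful when the member to be modified is chosen at run time. In the sketch below the second member `other` and the helper `set_member()` are hypothetical additions.

```cpp
struct S {
    int mem;
    int other;      // hypothetical second member
};

// Assigns v to whichever int member of *ps the pointer psm designates.
void set_member(S* ps, int S::* psm, int v) {
    ps->*psm = v;
}
```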

User-Defined Free Store Management

C++ provides the operators new and delete to allocate memory on the free store and to release store
allocated this way for reuse. Occasionally a user needs a finer-grained control of allocation and deallocation.
The first section below shows "the bad old way" of doing this and the following sections
show how the usual scope and overloaded function resolution mechanisms can be exploited to
achieve similar effects more elegantly. This means that assignment to this is an anachronism and will
be removed from the implementations of C++ after a decent interval. This will allow the type of this
in a member function of class X to be changed to X *const.

Assignment to this 

If a user wanted to take over allocation of objects of a class X the only way used to be to assign to this
on each path through every constructor for X. Similarly, the user could take control of deallocation by
assigning to this in a destructor. This is a very powerful and general mechanism. It is also non-obvious,
error prone, repetitive, too subtle when derived classes are used, and essentially unmanageable
when multiple inheritance is used. For example:

class X {
    int on_free_store;
    // ...
public:
    X();
    X(int i);
    ~X();
    // ...
};

Every constructor needs code to determine when to use the user-defined allocation strategy: 

X::X() {
    if (this == 0) {    // 'new' used
        this = myalloc(sizeof(X));
        on_free_store = 1;
    }
    else {    // static, automatic, or member of aggregate
        this = this;    // forget this assignment at your peril
        on_free_store = 0;
    }
    // initialize
}

The assignments to this are "magic" in that they suppress the usual compiler generated allocation
code.

Similarly, the destructor needs code to determine when to use the user-defined de-allocation strategy
and use an assignment to this to indicate that it has taken control over deallocation:

X::~X() {
    // cleanup
    if (on_free_store) {
        myfree(this);
        this = 0;    // forget this assignment at your peril
    }
}

This user-defined allocation and de-allocation strategy isn't inherited by derived classes in the usual
way.

The fundamental problem with the "assign to this" approach to user-controlled memory management
is that initialization and memory management code are intertwined in an ad hoc manner. In particular,
this implies that the language cannot provide any help with these critical activities.

Class-Specific Free Store Management 

The alternative is to overload the allocation function operator new() and the deallocation function
operator delete() for a class X:

class X {
    // ...
public:
    void* operator new(size_t sz) { return myalloc(sz); }
    void operator delete(X* p) { myfree(p); }

    X() { /* initialize */ }
    X(int i) { /* initialize */ }
    ~X() { /* cleanup */ }
    // ...
};

The type size_t is an implementation defined integral type used to hold object sizes⁴. It is the type of
the result of sizeof.

Now X::operator new() will be used instead of the global operator new() for objects of class X. Note
that this does not affect other uses of operator new within the scope of X:

void* X::operator new(size_t s)
{
    void* p = new char[s];    // global operator new as usual
    //...
    return p;
}

void X::operator delete(X* p)
{
    //...
    delete (void*) p;    // global operator delete as usual
}

When the new operator is used to create an object of class X, operator new() is found by a lookup
starting in X's scope so that X::operator new() is preferred over a global ::operator new().

Inheritance of operator new()

The usual rules for inheritance apply:

class Y : public X    // objects of class Y are also allocated
{                     // using X::operator new
    // ...
};

This is the reason X::operator new() needs an argument specifying the amount of store to be allocated;
sizeof(Y) is typically different from sizeof(X). Naturally, a class that is never a base class need not use
the size argument:

void* Z::operator new(size_t) { return next_free_Z(); }

This optimization should not be used unless the programmer is perfectly sure that Z is never used as a
base class because, if it is, disaster will happen.

An operator new(), be it local or global, is used only for free store allocation so

X a1;    // allocated statically

void f()
{
    X a;        // allocated on the stack
    X v[10];    // allocated on the stack
}

does not involve any operator new(). Instead, store is allocated statically and on the stack.

X::operator new() is only used for individual objects of class X (and objects of classes derived from
class X that do not have their own operator new()) so

X* p = new X[10];

does not involve X::operator new() because X[10] is an array.

Like the global operator new(), X::operator new() returns a void*. This indicates that it returns uninitialized
memory. It is the job of the compiler to ensure that the memory returned by this function is
converted to the proper type and, if necessary, initialized using the appropriate constructor. This
is exactly what happens for the global operator new().

X::operator new() and X::operator delete() are static member functions. In particular, they have no
this pointer. This reflects the fact that X::operator new() is called before constructors so that initialization
has not yet happened and X::operator delete() is called after the destructor so that the memory no
longer holds a valid object of class X.
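
The behavior described in this section can be sketched for a modern compiler as follows. The counter `live_blocks` is a hypothetical bookkeeping member, `std::malloc` and `std::free` stand in for the text's myalloc() and myfree(), and the deallocation function takes a `void*`, which is the form current compilers require.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

class X {
    int data;
public:
    static int live_blocks;     // hypothetical count of blocks from X::operator new
    X() : data(0) {}

    // Class-specific allocator: used only for individual X objects on the free store.
    void* operator new(std::size_t sz) {
        void* p = std::malloc(sz);
        if (!p) throw std::bad_alloc();
        ++live_blocks;
        return p;
    }
    // Class-specific deallocator, called after the destructor has run.
    void operator delete(void* p) {
        if (!p) return;
        --live_blocks;
        std::free(p);
    }
};

int X::live_blocks = 0;
```

Automatic objects and arrays created with `new X[10]` bypass `X::operator new()`, so they leave the counter unchanged.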


Overloading operator newO 

Like other functions, operator new() can be overloaded. Every operator new() must return a void*
and take a size_t as its first argument. For example:

void* operator new(size_t sz);    // the usual

void* operator new(size_t sz, heap* h)    // allocate from heap 'h'
{
    return h->allocate(sz);
}

void* operator new(size_t, void* p)    // place object at 'p'
{
    return p;
}

The size argument is implicitly provided when operator new is used. Subsequent arguments must be
explicitly provided by the user. The notation used to supply these additional arguments is an argument
list placed immediately after the new operator itself:

static char buf[sizeof(X)];    // static buffer

class heap {
    // ...
};

heap h1;

f() {
    X* p1 = new X;          // use the default allocator
                            // operator new(size_t sz):
                            // operator new(sizeof(X))

    X* p2 = new(buf) X;     // explicit allocation in 'buf'
                            // operator new(size_t, void* p):
                            // operator new(sizeof(X),buf)

    X* p3 = new(&h1) X;     // use h1's allocator
                            // operator new(size_t sz, heap* h):
                            // operator new(sizeof(X),&h1)
}

Note that the explicit arguments go after the new operator but before the type. Arguments after the
type go to the constructor as ever. For example:

class Y {
    void* operator new(size_t, const char*);
    Y(const char*);
};

Y* p = new("string for the allocator") Y("string for the constructor");
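
A hedged sketch of an allocator argument in use, for a modern compiler. The `heap` class here is a hypothetical fixed-size bump allocator standing in for the unspecified heap of the text, and the struct `X` is a minimal stand-in; the matching `operator delete(void*, heap*)` is something current C++ calls only if the constructor throws.

```cpp
#include <cstddef>
#include <new>

// Hypothetical bump allocator; never reuses store.
class heap {
    alignas(std::max_align_t) unsigned char store[256];
    std::size_t used;
public:
    heap() : used(0) {}
    void* allocate(std::size_t sz) {
        const std::size_t a = alignof(std::max_align_t);
        sz = (sz + a - 1) / a * a;              // keep later blocks aligned
        if (used + sz > sizeof(store)) throw std::bad_alloc();
        void* p = store + used;
        used += sz;
        return p;
    }
    std::size_t bytes_used() const { return used; }
};

// Allocate from heap 'h'; sz is supplied implicitly by the new expression.
void* operator new(std::size_t sz, heap* h) { return h->allocate(sz); }
// Matching form, used only when a constructor throws during such a new.
void operator delete(void*, heap*) noexcept {}

struct X {                                      // minimal stand-in class
    int v;
    X(int i) : v(i) {}
};
```

With these declarations, `new (&h1) X(7)` passes `&h1` to the allocator and `7` to the constructor.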


Controlling Deallocation 

Where many different operator new() functions are used one might imagine that one would need
many different and matching operator delete() functions. This would, however, be quite inconvenient
and often unmanageable. The fundamental difference between creation and deletion of objects is that
at the point of creation the programmer knows just about everything worth knowing about the object
whereas at the point of deletion the programmer holds only a pointer to the object. This pointer may
not even give the exact type of the object, but only a base class type. It will therefore typically be
unreasonable to require the programmer writing a delete to choose among several variants.

Consider a class with two allocation functions and a single deallocation function that chooses the
proper way of deallocating based on information left in the object by the allocators:

class X {
    enum { somehow, other_way } which_allocator;

    void* operator new(size_t sz)
    {
        void* p = allocate_somehow();
        ((X*)p)->which_allocator = somehow;
        return p;
    }

    void* operator new(size_t sz, int i)
    {
        void* p = allocate_some_other_way();
        ((X*)p)->which_allocator = other_way;
        return p;
    }

    void operator delete(void*);
    // ...
};

Here operator delete() can look at the information left behind in the object by the operator new() used
and deallocate appropriately:

void X::operator delete(void* p)
{
    switch (((X*)p)->which_allocator) {
    case somehow:
        deallocate_somehow();
        break;
    case other_way:
        deallocate_some_other_way();
        break;
    default:
        /* something is funny */
    }
}


Since operator new() and operator delete() are static member functions they need to cast their "object
pointers" to use member names. Furthermore, these functions will be invoked only by explicit use of
operators new and delete. This implies that X::which_allocator is not initialized for automatic objects
so in that case it may have an arbitrary value. In particular, the default case in X::operator delete()
might occur if someone tried to delete an automatic (on the stack) object.

Where (as will often be the case) the rest of the member functions of X have no need for examining the
information stored by allocators for use by the deallocator this information can be placed in storage
outside the object proper ("in the container itself") thus decreasing the memory requirement for
automatic and static objects of class X. This is exactly the kind of game played by "ordinary" allocators
such as the C malloc() for managing free store.

The example of the use of assignment to this above contains code that depends on knowing whether 
the object was allocated by new or not. Given local allocators and deallocators, it is usually neither 
wise nor necessary to do so. However, in a hurry or under serious compatibility constraints, one 
might use a technique like this: 

class X {
    static X* last_X;
    int on_free_store;
    // ...
    X();
    void* operator new(long s)
    {
        return last_X = allocate_somehow();
    }
    // ...
};

X::X()
{
    if (this == last_X) {    // on free store
        on_free_store = 1;
    }
    else {    // static or automatic or member of aggregate
        on_free_store = 0;
    }
    // ...
}

Note that there is no simple and implementation independent way of determining that an object is 
allocated on the stack. There never was. 

Placement of Objects 

For ordinary functions it is possible to specifically call a non-member version of the function by
prefixing a call with the scope resolution operator ::. For example,

::open(filename,"rw");

calls the global open(). Prefixing a use of the new operator with :: has the same effect for operator
new(); that is,

X* p = ::new X;

uses a global operator new() even if a local X::operator new() has been defined. This is useful for placing
objects at specific addresses (to cope with memory mapped I/O, etc.) and for implementing container
classes that manage storage for the objects they maintain. Using :: ensures that local allocation
functions are not used and the argument(s) specified for new allows selection among several global
operator new() functions. For example:

// place object at address p:
void* operator new(size_t, void* p) { return p; }

char buf[sizeof(X)];                // static buffer

f()
{
    X* p = ::new(buf) X;            // explicit allocation in 'buf'
    p = ::new((void*)0777) X;       // place an X at address 0777
}

Naturally, for most classes the :: will be redundant since most classes do not define their own allocators.
The notation ::delete can be used similarly to ensure use of a global deallocator.

Memory Exhaustion 

Occasionally, an allocator fails to find memory that it can return to its caller. If the allocator must
return in this case it should return the value 0. A constructor will return immediately upon finding
itself called with this==0 and the complete new expression will yield the value 0. In the absence of
more elegant error handling schemes, this enables critical software to defend itself against allocation
problems:

void f()
{
    X* p = new X;
    if (p == 0) { /* handle allocation error */ }
    // use p
}

The use of a new handler can make most such checks unnecessary. 

Explicit Calls of Destructors 

Where an object is explicitly "placed" at a specific address or in some other way allocated so that no
standard deallocator can be used, there might still be a need to destroy the object. This can be done
by an explicit call of the destructor:

p->X::~X();

The fully qualified form of the destructor's name must be used to avoid potential parsing ambiguities.
This requirement also alerts the user that something unusual is going on. After the call of the destructor,
p no longer points to a valid object of class X.
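
Placement and explicit destruction can be combined in a short sketch for a modern compiler. In current C++ the placement form `operator new(size_t, void*)` is supplied by the standard header `<new>` rather than written by the user. The counter `alive` and the function `placed_then_destroyed()` are hypothetical and only record what happened.

```cpp
#include <new>      // declares the standard placement operator new(size_t, void*)

struct X {
    static int alive;       // hypothetical count of constructed objects
    X()  { ++alive; }
    ~X() { --alive; }
};
int X::alive = 0;

alignas(X) static unsigned char buf[sizeof(X)];    // static buffer

// Returns 10 * (objects alive after placement) + (objects alive after destruction).
int placed_then_destroyed() {
    X* p = ::new (buf) X;       // construct an X in buf; no store is allocated
    int during = X::alive;
    p->X::~X();                 // destroy the object; buf itself is not released
    return during * 10 + X::alive;
}
```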

Size Argument to operator delete()

Like X::operator new(), X::operator delete() can be overloaded, but since there is no mechanism for the
user to supply arguments to a deallocation function this overloading simply presents the programmer
with a way of using the information available in the compiler. X::operator delete() can have two forms
(only):

class X {
    void operator delete(void* p);
    void operator delete(void* p, size_t sz);
    // ...
};


If the second form is present it will be preferred by the compiler and the second argument will be the
size of the object to the best of the compiler's knowledge. This allows a base class to provide memory
management services for derived classes:

class X {
    void* operator new(size_t sz);
    void operator delete(void* p, size_t sz);
    virtual ~X();
    // ...
};

The use of a virtual destructor is crucial for getting the size right in cases where a user deletes an
object of a derived class through a pointer to the base class:

class Y : public X {
    // ...
    ~Y();
};

X* p = new Y;
delete p;
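
The effect of the virtual destructor on the size argument can be observed with the following sketch, assuming a modern compiler. The global `last_freed_size`, the padding member `extra`, and the helper function are hypothetical; `std::malloc` and `std::free` stand in for a real allocation strategy.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t last_freed_size = 0;    // hypothetical record of the size passed

class X {
public:
    void* operator new(std::size_t sz) {
        void* p = std::malloc(sz);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // Two-argument form: sz is the size of the object being deleted.
    void operator delete(void* p, std::size_t sz) {
        last_freed_size = sz;
        std::free(p);
    }
    virtual ~X() {}
};

class Y : public X {
    char extra[64];         // makes sizeof(Y) differ from sizeof(X)
public:
    ~Y() override {}
};

// Deletes a Y through an X* and reports the size the deallocator received.
std::size_t size_seen_when_deleting_Y_through_X() {
    X* p = new Y;
    delete p;
    return last_freed_size;
}
```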


Assignment and Initialization 

C++ originally had assignment and initialization default defined as bitwise copy of an object. This
caused problems when an object of a class with assignment was used as a member of a class that did
not have assignment defined:

class X {
    // ...
public:
    X& operator=(const X&);
    // ...
};

class Y {
    X a;
    // ...
};

void f()
{
    Y y1, y2;
    // ...
    y1 = y2;
}

Assuming that assignment was not defined for Y, y2.a is copied into y1.a with a bitwise copy. This
invariably turns out to be an error and the programmer has to add an assignment operator to class Y:

class Y ( 

X a; 

U ... 

const Yt operator-(const Yt arg) 

( 

a - arg.a; 

// ... 

) 

); 


To cope with this problem in general, assignment in C++ is now defined as memberwise assignment of
non-static members and base class objects. Naturally, this rule applies recursively until a member of a
built-in type is found. This implies that for a class X, X::X(const X&) and const X& X::operator=(const
X&) will be supplied where necessary by the compiler, as has always been the case for X::X() and
X::~X(). In principle every class X has X::X(), X::X(const X&), and X::operator=(const X&) defined. In
particular, defining a constructor X::X(T) where T isn't a variant of X& does not affect the fact that
X::X(const X&) is defined. Similarly, defining X::operator=(T) where T isn't a variant of X& does not
affect the fact that X::operator=(const X&) is defined.
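
The memberwise rule can be observed with a small sketch. It is an illustration only, not code from the
paper: the counter x_assignments, the extra member n, and the helper assign_y_and_count() are
assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical counter: how many times X's user-defined assignment has run.
static int x_assignments = 0;

class X {
public:
    int value;
    X(int v = 0) : value(v) {}
    X& operator=(const X& other) {
        ++x_assignments;
        value = other.value;
        return *this;
    }
};

class Y {                    // no assignment operator is declared for Y
public:
    X a;
    int n;
    Y(int v, int k) : a(v), n(k) {}
};

// Assigns one Y to another and returns the number of X::operator= calls made,
// or -1 if the members were not copied as expected.
int assign_y_and_count() {
    x_assignments = 0;
    Y y1(1, 10), y2(2, 20);
    y1 = y2;                 // generated Y::operator= assigns member by member
    return (y1.a.value == 2 && y1.n == 20) ? x_assignments : -1;
}
```

The generated assignment for Y calls X::operator= exactly once for the member a and copies the
built-in member n directly.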

To avoid nasty inconsistencies between the predefined operator=() functions and user defined
operator=() functions, operator=() must be a member function. Global assignment functions, such as
::operator=(X&,X&) are anachronisms and will be disallowed after a decent interval.

Note that since access controls are correctly applied to both implicit and explicit copy operations we
actually have a way of prohibiting assignment of objects of a given class X:

    class X {
        // Objects of class X cannot be copied
        // except by members of X
        void operator=(X&);
        X(X&);
        // ...
    public:
        X(int);
        // ...
    };

    void f() {
        X a(1);
        X b = a;    // error: X::X(X&) private
        b = a;      // error: X::operator=(X&) private
    }

The automatic creation of X::X(const X&) and X::operator=(const X&) has interesting implications on
the legality of some assignment operations. Note that if X is a public base class of Y then a Y object is
a legal argument for a function that requires an X&. For example:

    class X { public: int aa; };
    class Y : public X { public: int bb; };

    void f() {
        X xx;
        Y yy;
        xx = yy;    // ok: a Y is an X
                    // xx=yy; means xx.operator=((X&)yy);
                    // and is optimized to xx.aa = yy.aa
    }

Defining assignment as memberwise assignment implies that operator=() isn't inherited in the ordinary
manner. Instead, the appropriate assignment operator is, if necessary, generated for each class.
This implies that the "opposite" assignment of an object of a base class to a variable of a derived class
is illegal as ever:

    void f() {
        X xx;
        Y yy;
        yy = xx;    // error: an X is not a Y
    }
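
A compilable sketch of both directions follows. It reuses the X and Y of the example above; the helper
assign_derived_to_base() and the particular values are assumptions added for illustration, and the
illegal direction is left as a comment so that the sketch compiles.

```cpp
#include <cassert>

class X { public: int aa; };
class Y : public X { public: int bb; };

// Assigns a Y to an X and returns the value that ends up in the X.
int assign_derived_to_base() {
    X xx;
    xx.aa = 0;
    Y yy;
    yy.aa = 7;
    yy.bb = 9;
    xx = yy;         // ok: yy binds to the const X& parameter of X::operator=
    // yy = xx;      // would not compile: an X is not a Y
    return xx.aa;
}
```

Only the X part of yy is copied; the member bb has no counterpart in xx and is ignored.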


The extension of the assignment semantics to allow assignment of an object of a derived class to a
variable of a public base class had been repeatedly requested by users. The direct connection to the
recursive memberwise assignment semantics became clear only through work on the two apparently
independent problems.


Operator ->


Until now -> has been one of the few operators a programmer couldn't define. This made it hard to 
create classes of objects intended to behave like "smart pointers." When overloading, -> is considered 
a unary operator (of its left hand operand) and -> is reapplied to the result of executing operator->().
Hence the return type of an operator->() function must be a pointer to a class or an object of a class
for which operator->() is defined. For example:

    struct Y { int m; };

    class X {
        Y* p;
        // ...
        Y* operator->() {
            if (p == 0) {
                // initialize p
            }
            else {
                // check p
            }
            return p;
        }
        // ...
    };


Here, class X is defined so that objects of type X act as pointers to objects of Y, except that some 
suitable computation is performed on each access. 

    void f(X x, X& xr, X* xp)
    {
        x->m;     // x.p->m
        xr->m;    // xr.p->m
        xp->m;    // error: X does not have a member m
    }

Like operator=(), operator[](), and operator()(), operator->() must be a member function (unlike
operator+(), operator-(), operator<(), etc., which are often most useful as friend functions).

The dot operator still cannot be overloaded.

For ordinary pointers, use of -> is synonymous with some uses of unary * and []. For example, for

    Y* p;

it holds that:

    p->m == (*p).m == p[0].m

As usual, no such guarantee is provided for user-defined operators. The equivalence can be provided
where desired:

    class X {
        Y* p;
    public:
        Y* operator->() { return p; }
        Y& operator*() { return *p; }
        Y& operator[](int i) { return p[i]; }
    };

If you provide more than one of these operators it might be wise to provide the equivalence exactly as
it is wise to ensure that x+=1 has the same effect as x=x+1 for a simple variable x of some class if +=,
=, and + are provided.

The overloading of -> is important to a class of interesting programs, just like overloading [], and not
just a minor curiosity. The reason is that indirection is a key concept and that overloading -> provides
a clean, direct, and efficient way of representing it in a program. Another way of looking at operator
-> is to consider it a way of providing C++ with a limited, but very useful, form of delegation.
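
One possible "smart pointer" along these lines is sketched below. It is not taken from the paper: the
idea of counting accesses, the member access_count(), and the helper count_accesses() are assumptions
chosen to make the "suitable computation on each access" visible.

```cpp
#include <cassert>

struct Y { int m; };

// Hypothetical "counting" smart pointer: every access through -> or * is tallied.
class X {
    Y* p;
    int accesses;
public:
    explicit X(Y* target) : p(target), accesses(0) {}
    Y* operator->() { ++accesses; return p; }    // the built-in -> is reapplied to the result
    Y& operator*() { ++accesses; return *p; }    // kept consistent with -> by hand
    int access_count() const { return accesses; }
};

// Uses the smart pointer like an ordinary pointer; returns tally * 100 + final value of m.
int count_accesses() {
    Y target;
    target.m = 1;
    X x(&target);
    x->m = 41;       // means (x.operator->())->m = 41
    (*x).m += 1;     // goes through operator*()
    return x.access_count() * 100 + target.m;
}
```

Here x forwards, or delegates, member access to the Y it refers to, which is the limited form of
delegation mentioned above.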


Operator ,

Until now the comma operator , has been one of the few operators a programmer couldn't define.
This restriction did not appear to have any purpose so it has been removed. The most obvious use of
an overloaded comma operator is list building:

    class Xlist {
        // ...
    public:
        Xlist();
        Xlist(X&);
        Xlist& operator,(X&);
        friend Xlist operator,(X&,X&);
    };

    void f()
    {
        X a,b,c;
        Xlist xl = (a,b,c);    // meaning operator,(a,b).operator,(c)
    }

If you have a bit of trouble deciding which commas mean what in this example you have found the
reason overloading of comma was originally left out.


Initialization of static Objects


In C, a static object can only be initialized using a slightly extended form of constant expressions. In 
C++, it has always been possible to use completely general expressions for the initialization of static 
class objects. This feature has now been extended to static objects of all types. For example: 

    #include <math.h>
    #include <stdlib.h>

    double sqrt2 = sqrt(2);

    main()
    {
        if (sqrt(2)!=sqrt2) abort();
    }

Such dynamic initialization is done in declaration order within a file and before the first use of any
object or function defined in the file. No order is defined for initialization of objects in different source
files except that all static initialization takes place before any dynamic initialization.
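
The ordering rules within a single file can be sketched as follows. The helper next_ticket() and the
variables first_ticket, second_ticket, and zeroed are hypothetical names used only to make the order
observable; the sketch says nothing about ordering across different source files.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper used only to observe the order of initialization.
static int next_ticket() {
    static int ticket = 0;
    return ++ticket;
}

// Non-local objects with non-constant initializers (dynamic initialization).
double sqrt2 = std::sqrt(2.0);
int first_ticket = next_ticket();
int second_ticket = next_ticket();   // after first_ticket: declaration order within the file

// Statically initialized to zero, before any of the dynamic initializers above run.
int zeroed;

// Checks the values that the rules above lead one to expect.
bool statics_initialized_in_order() {
    return sqrt2 * sqrt2 > 1.99 && sqrt2 * sqrt2 < 2.01
        && first_ticket == 1 && second_ticket == 2 && zeroed == 0;
}
```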


Resolutions 

This section does not describe additions to C++ but gives answers to questions that have been asked
often and do not appear to have clear enough answers in the reference manual of The C++
Programming Language. These resolutions involve slight changes compared to earlier rules. This was
done to bring C++ closer to the ANSI C draft.

Function Argument Syntax 

Like the C syntax, the C++ syntax for specifying types allows the type int to be implicit in some cases.
This opens the possibility of ambiguities. In argument declarations, C++ chooses the longest type
possible when there appears to be a choice:

    typedef long I;

    f1(const I);    // f1() takes an unnamed 'const long' argument
    f2(const i);    // f2() takes a 'const int' argument (called 'i')

This rule applies to the const and volatile specifiers, but not to unsigned, short, long, or signed⁷:

    f3(unsigned int I);    // ok
    f4(unsigned I);        // ok: equivalent to f4(unsigned int I);

A type cannot contain two basic type specifiers so

    f5(char I) { I++; }
    f6(I I) { I++; }

are legal.




Declaration and Expression Syntax 

There is an ambiguity in the C++ grammar involving expression-statements and declarations: An 
expression-statement with a "function style" explicit type conversion as its leftmost sub-expression can 
be indistinguishable from a declaration where the first declarator starts with a (. For example: 

    T(a);    // declaration or type conversion of 'a'

In those cases the statement is a declaration. 

To disambiguate, the whole statement may have to be examined to determine if it is an
expression-statement or a declaration. This disambiguates many examples. For example, assume T is
the name of some type:

    T(a)->m = 7;          // expression-statement
    T(a)++;               // expression-statement
    T(a,5)<<c;            // expression-statement
    T(*d)(double(3));     // expression-statement

    T(*e)(int);           // declaration
    T(f)[];               // declaration
    T(g)={ 1,2 };         // declaration

The remaining cases are declarations. For example:

    T(a);                 // declaration
    T(*b)();              // declaration
    T(c)=7;               // declaration
    T(d),e,f=3;           // declaration
    T(g)(h,2);            // declaration

The disambiguation is purely syntactic; that is, the meaning of the names, beyond whether they are
names of types or not, is not used in the disambiguation.

This resolution has two virtues compared to alternatives: It is simple to explain and completely
compatible with C. The main snag is that it is not well adapted to simple minded parsers, such as YACC,
because the lookahead required to decide what is an expression-statement and what is a declaration in a
statement context is not limited.

However, note that a simple lexical lookahead can help a parser disambiguate most cases. Consider
analyzing a statement; the troublesome cases look like this:

    T ( d-or-e ) tail

Here, d-or-e must be a declarator, an expression, or both for the statement to be legal. This implies that
tail must be a semicolon, something that can follow a parenthesized declarator or something that can
follow a parenthesized expression. That is, an initializer, const, volatile, (, or [ or a postfix or infix
operator.

A user can explicitly disambiguate cases that appear obscure. For example:

    void f()
    {
        auto int(*p)();      // explicitly declaration
        (void) int(*p)();    // explicitly expression-statement
        0,int(*p)();         // explicitly expression-statement
        (int(*p)());         // explicitly expression-statement
        int(*p)();           // resolved to declaration
    }
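
The effect of the rule can be made visible at run time with a sketch like the one below. The type T
with its two constructors, the two counters, and the helper classify_statements() are assumptions
made for this illustration; they do not appear in the paper.

```cpp
#include <cassert>

// Hypothetical counters for observing which constructor runs.
static int default_constructed = 0;
static int converted_from_int = 0;

struct T {
    T() { ++default_constructed; }
    T(int) { ++converted_from_int; }
};

// Returns default_constructed * 10 + converted_from_int after running both statements.
int classify_statements() {
    default_constructed = 0;
    converted_from_int = 0;
    int a = 5;
    {
        T(a);          // declaration: a new T named 'a', hiding the int
    }
    {
        (void) T(a);   // expression-statement: converts the int 'a' to a T
    }
    return default_constructed * 10 + converted_from_int;
}
```

Some compilers may warn about the redundant parentheses in the first statement, which is a fair hint
that the declaration reading is the one being taken.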


Enumerators 


An enumeration is a type. Each enumeration is distinct from all other types. The set of possible
values for an enumeration is its set of enumerators. The type of an enumerator is its enumeration.
For example:

    enum wine { red, white, rose, bubbly };
    enum beer { ale, bitter, lager, stout };

defines two types, each with a distinct set of 4 values.

    wine w = red;
    beer b = bitter;

    w = b;        // error, type mismatch: beer assigned to wine
    w = stout;    // error, type mismatch: beer assigned to wine
    w = 2;        // error, type mismatch: int assigned to wine

Each enumerator has an integer value and can be used wherever an integer is required; in such cases
the integer value is used:

    int i = rose;    // the value of 'rose' (that is, 2) is used
    i = b;           // the value of 'b' is assigned to 'i'


This interpretation is stricter than what has been used in C++ until now and stricter than most C
dialects. The reason for choosing it was ANSI C's requirement that enumerations be distinct types.
Given that, the details follow from C++'s emphasis on type checking and the requirements of
consistency to allow overloading, etc. For example:

    int f(int);
    int f(wine);

    void g()
    {
        f(i);        // f(int)
        f(w);        // f(wine)
        f(1);        // f(int)
        f(white);    // f(wine)
        f(b);        // f(int), standard conversion
                     // from beer to int used
    }


C++'s checking of enumerations is stricter than ANSI C's, in that assignments of integers to
enumerations are disallowed. As ever, explicit type conversion can be used:

    w = wine(257);    /* caveat emptor */
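
The overloading behaviour described above can be checked with a short sketch. The enumerations are
those of the text; the return values 1 and 2 and the helper overloads_chosen() are assumptions added
so that the selected overload can be observed.

```cpp
#include <cassert>

enum wine { red, white, rose, bubbly };
enum beer { ale, bitter, lager, stout };

// Two overloads, told apart by whether the argument is a wine.
int f(int)  { return 1; }    // returns 1 to stand for f(int)
int f(wine) { return 2; }    // returns 2 to stand for f(wine)

// Encodes which overload each of four calls selects as a four-digit number.
int overloads_chosen() {
    wine w = red;
    beer b = bitter;
    int i = rose;            // the enumerator converts to its integer value, 2
    // w = b;                // would not compile: beer assigned to wine
    // w = 2;                // would not compile: int assigned to wine
    return f(i) * 1000 + f(w) * 100 + f(white) * 10 + f(b);
}
```

The last call selects f(int) because a beer is not a wine but does convert to int.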

An enumerator is entered in the scope in which the enumeration is defined. In this context, a class is 
considered a scope and the usual access control rules apply. For example: 

    class X {
        enum { x, y, z };
        // ...
    public:
        enum { a, b, c };
        f(int i = a) { g(i+x); ... }
        // ...
    };

    void h() {
        int i = a;    // error: 'X::a' is not in scope
        i = X::a;     // ok
        i = X::x;     // error: 'X::x' is private
    }


The const Specifier

Use of the const specifier on a non-local object implies that linkage is internal by default; that is, the
object declared is local to the compilation in which it occurs. To give it external linkage it must be
explicitly declared extern.

Similarly, inline implies that linkage is internal by default. External linkage can be obtained by
explicit declaration:

    extern const double g;
    const double g = 9.81;

    extern inline f(int);
    inline f(int i) { return i+c; }


Function Types 

It is possible to define function types that can be used exactly like other types, except that variables of
function types cannot be defined, only variables of pointer to function types:

    typedef int F(char*);    // function taking a char* argument
                             // and returning an int
    F* pf;                   // pointer to such function
    F f;                     // error: no variables of function type allowed

Function types can be useful in friend declarations. Here is an example from the C++ task system:

    class task : public scheduler {
        friend SIG_FUNC_TYP sig_func;    // the type of a function must be specified
                                         // in a friend function declaration
        // ...
    };

The reason to use a typedef in the friend declaration of sig_func and not simply to write the type
directly is that the type of signal() is system dependent:

    // BSD signal.h:
    typedef void SIG_FUNC_TYP(int, int, sigcontext*);

    // 9th edition signal.h:
    typedef void SIG_FUNC_TYP(int);

Using the typedef allows the system dependencies to be localized where they belong: in the header
files defining the system interface.

Lvalues 

Note that the default definition of assignment of an X as a call of

    X& operator=(const X&)

makes assignment of Xs produce an lvalue. For uniformity, this rule has been extended to
assignments of built-in types. By implication, +=, -=, *=, etc., now also produce lvalues. So, again by
implication, do prefix ++ and -- (but not the postfix versions of these operators).
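
For built-in types the rule can be illustrated with the sketch below. The helper lvalue_results() and
the values used are assumptions for this example; the addresses are compared only to show that each
result designates the original object.

```cpp
#include <cassert>

// Each statement uses the result of an assignment-like operator as an lvalue.
// Returns a * 10 + b, or -1 if a result did not designate the expected object.
int lvalue_results() {
    int a = 1;
    int b = 2;
    int* pa = &(a = b);     // the result of = is the object a itself
    int& r = (a += 3);      // += yields a, so r is another name for a
    r *= 2;
    int* pb = &++b;         // prefix ++ yields b itself
    // int* bad = &b++;     // would not compile: postfix ++ does not yield an lvalue
    return (pa == &a && pb == &b) ? a * 10 + b : -1;
}
```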

In addition, the comma and ?: can also produce lvalues. The result of a comma operation is an lvalue
if its second operand is. The result of a ?: operator is an lvalue provided both its second and third
operands are and provided they have exactly the same type.


Multiple Name Spaces

C provides a separate name space for structure tags whereas C++ places type names in the same name
space as other names. This gives important notational conveniences to the C++ programmer but
severe headaches to people managing header files in mixed C/C++ environments. For example:

    struct stat {
        // ...
    };

    extern struct stat(int, struct stat *);

was not legal C++ though early implementations accepted it as a compatibility hack. The experience
has been that trying to impose the "pure C++" single name space solution (thus outlawing examples
such as the one above) has caused too much confusion and too much inconvenience to too many users.
Consequently, a slightly cleaned up version of the C/C++ compatibility hack has now become part of
C++. This follows the overall principle that where there is a choice between inconveniencing compiler
writers and annoying users, the compiler writers should be inconvenienced. It appears that the
compromise provided by the rules presented below enables all accepted uses of multiple name spaces
in C while preserving the notational convenience of C++ in all cases where C compatibility isn't an
issue. In particular, every legal C++ program remains legal. The restrictions on the use of
constructors and typedef names in connection with the use of multiple name spaces are imposed to
prevent some nasty cases of hard to detect ambiguities that would cause trouble for the composition of
C++ header files.

A typedef can declare a name to refer to the same type more than once. For example: 

    typedef struct s { /* ... */ } s;
    typedef s s;


A name s can be declared as a type (struct, class, union, enum) and as a non-type (function, object,
value, etc.) in a single scope. In this case, the name s refers to the non-type and struct s (or whatever)
can be used to refer to the type. The order of declaration does not matter. This rule takes effect only
after both declarations of s have been seen. For example:

    struct stat { /* ... */ };
    stat a;
    void stat(stat* p);
    struct stat b;             // struct is needed to avoid the function name
    stat(0);                   // function call

    int f(int);
    f(1);
    struct f { /* ... */ };
    struct f a;                // struct is needed to avoid the function name

A name cannot simultaneously refer to two types:

    struct s { /* ... */ };
    typedef int s;             // error

The name of a class with a constructor cannot also simultaneously refer to something else:

    struct s { s(); /* ... */ };
    int s();                        // error

    struct t* p;
    int t();                        // ok
    int i = t();

    struct t { t(); /* ... */ };    // error
    i = t();


If a non-type name s hides a type name s, struct s can be used to refer to the type name. For example: 

struct s { /* ... */ }; 
f(int s) { struct s a; s++; } 

Note: If a type name hides a non-type name the usual scope rules apply:

    int s;
    f()
    {
        struct s { /* ... */ };    // new 's' refers to the type
                                   // and the global int is hidden
        s a;
    }

Use of the :: scope resolution operator implies that its argument is a non-type name. For example:

    int s;
    f()
    {
        struct s { /* ... */ };
        s a;
        ::s = 1;
    }
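
Several of the rules in this section can be exercised together in a small compilable sketch. The names
record and use_hidden_type() are hypothetical and were chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical names: a type and a function sharing one name in one scope.
struct record { int size; };

int record(struct record* p) {   // the plain name 'record' now refers to this function
    return p ? p->size : -1;
}

// The parameter 'record' hides both global names; 'struct record' still names the type.
int use_hidden_type(int record) {
    struct record r;
    r.size = record + 1;
    return ::record(&r);         // :: selects the global non-type name, the function
}
```

Without the struct keyword, the declaration of r would not compile, because inside the function the
plain name record refers to the int parameter.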


Function Declaration Syntax 

To ease use of common C++ and ANSI C header files, void may be used to indicate that a function 
takes no arguments: 

    extern int f(void);    // same as "extern int f();"


Conclusions


C++ is holding up nicely under the strain of large scale use in a diverse range of application areas. 

The extensions added so far have all been relatively easy to integrate into the C++ type system. The
C syntax, especially the C declarator syntax, has consistently caused much greater problems than the
C semantics; it remains barely manageable. The stringent requirements of compatibility and
maintenance of the usual run-time and space efficiencies did not constrain the design of the new
features noticeably. Except for the introduction of the keywords catch, private, protected, signed,
template, and volatile, the extensions described here are upward compatible. Users will find,
however, that type-safe linkage, improved enforcement of const, and improved handling of ambiguities
will force modification of some programs by detecting previously uncaught errors.


Footnotes


1. Surprisingly, giving character constants type char does not cause incompatibilities with C where
they have type int. Except for the pathological example sizeof('a'), every construct that can be
expressed in both C and C++ gives the same result. The reason for the surprising compatibility
is that even though character constants have type int in C, the rules for determining the values
of such constants involve the standard conversion from char to int.

2. The strategy for dealing with ambiguities in inheritance DAGs is essentially the same as the
strategy for dealing with ambiguities in expression evaluation involving overloaded operators and
user-defined coercions. Note that the access control mechanism does not affect the ambiguity
control mechanism. Had B::f() been private the call p->f() would still be ambiguous.

3. Virtual base classes force a modification to this rule; see below.

4. operator new() used to require a long; size_t was adopted to bring C++ allocation mechanisms
into line with ANSI C.

5. The requirement that a programmer must distinguish between delete p for an individual object
and delete[n] p for an array is an unfortunate hack and is mitigated only by the fact that there
is nothing that forces a programmer to use such arrays.

6. One could argue that the original definition of C++ was inconsistent in requiring bitwise copy of
objects of class Y, yet guaranteeing that X::operator=() would be applied for copying objects of a
class X. In this case both guarantees cannot be fulfilled.

7. This resolution involves a slight change compared to earlier rules. This was done to bring this
aspect of C++ into line with the ANSI C draft.


An Introduction to C++


NOTE 


This chapter is taken directly from a paper by Keith Gorlen.


The C++ programming language was designed and implemented by Bjarne Stroustrup of AT&T Bell
Laboratories as a successor to C¹. It retains compatibility with existing C programs and the efficiency
of C. It also adds many powerful new capabilities, making it suitable for a wide range of applications
from device drivers to artificial intelligence. C++ will be of interest to UNIX users because of its
intimate relation to C and its potential use for building graphical user interfaces to UNIX, for UNIX
systems programming, and for supporting large-scale software development under UNIX.

C++ evolved from a dialect of C known as "C with Classes," which was invented in 1980 as a 
language for writing efficient event-driven simulations. Several key ideas were borrowed from the 
Simula67 and Algol 68 programming languages. Figure 2-1 shows the heritage of C++. 

Figure 2-1: The Heritage of C++



The definitive book on C++ is Bjarne Stroustrup's The C++ Programming Language, which gives a
detailed description of the language and contains many examples and exercises. It also includes the
C++ reference manual, which is a concise, more formal definition of the language.

In this chapter, we'll see how C++ corrects most of the deficiencies of C by offering improved
compile-time type checking and support for encapsulation. We'll also introduce you to many of the
new features of C++:

■ classes 

■ type checking 

■ operator and function name overloading 

■ free store management

■ constant types

■ references 

■ inline functions 

■ derived classes 

■ virtual functions 

We'll present these features in the context of a non-trivial example so that you'll understand the 
motivation behind them and see how they are typically used. 

By the end of the paper, you'll see how proper use of C++ can dramatically increase a programmer's
productivity. C++ programs are shorter, clearer, and more likely to be correct from the outset. As a
result, they are also easier to debug and to maintain.

We'll conclude the paper by discussing the current status and future of C++.


A C++ Example


The best way to learn about C++ is to write a program in it, and that is what we'll do in the next three
sections. Let's start in familiar territory by taking a look at a simple program written in plain old C:

    main()
    {
        int a = 193;
        int b = 456;
        int c = a + b + 47;
        printf("%d\n",c);
    }

This program declares three integer variables named a, b, and c, initializing a and b to the values 193
and 456, respectively. The integer c is initialized to the result of adding a and b and the constant 47.

Finally, the standard C library function printf() is called to print out the value of c. The quoted string
"%d\n" tells how to print the result: %d prints c as a decimal number, and \n adds a newline character.
If we compile and execute this program, it prints out the number 696 and exits.

Now suppose we wish to perform a similar calculation, but this time a and b are big numbers, like the
U. S. national debt expressed in dollars. Such numbers are too big to be stored as ints on most
computers, so if we tried to write int a = 25123654789456 the C compiler (hopefully!) would give us an
error message and fail to compile the program. There are many practical applications for big integers,
such as cryptography, symbolic algebra, and number theory, where it can be necessary to perform 
arithmetic on numbers with hundreds or even thousands of digits. 

It isn't easy to write a program to deal with these big numbers in ordinary C. Coding and debugging
the algorithms that perform arithmetic operations on big integers in C involves a significant amount of
work, so we'd want to make them general-purpose. We wouldn't be able to predict how big the
numbers might become in advance, so we would have to use a dynamic memory allocator to manage
their storage at execution time. We'd need to write a C library of functions for creating, destroying,
reading, printing, assigning, and performing basic arithmetic on big integers. These functions would
have to have distinctive names such as create_bigint, print_bigint, and add_bigints to avoid confusion
with other kinds of data that we might want to create, print, or add in the same program.

Worst of all, programmers wishing to use our big integers would have to know the names of these
functions and the rules for calling them. They would have to remember to create and initialize big
integers when they needed to use them, and to destroy them when they were finished. Even simple
arithmetic expressions would be awkward to write; c = a+b would have to be coded as:

    assign_bigint(&c, add_bigints(a, b))

and there might be problems with handling temporary results calculated during the evaluation of a
complex expression. Also, programmers would have to be careful when combining big integers with
other data types such as int. They would need to call a function to convert ints to big integers before
adding them, for example. Any C program using big integers would be both difficult to write and
difficult to read.

In C++, we still must write the code to manage the storage of big integers and functions to perform
the same operations on them. The difference is that C++ lets us "package" this code so that using our
big integers is as convenient as using the int data type that is built into C. We can, in effect, extend
the C++ language by adding our own custom data type, which we'll call BigInt. Notice how similar
the example C program is to this C++ program which performs a similar calculation using BigInts
instead of ints:

    #include "BigInt.h"
    main()
    {
        BigInt a = "25123654789456";
        BigInt b = "456023398798362";
        BigInt c = a + b + 47;
        c.print();    /* print the result, c */
        printf("\n");
    }


Data Abstraction 

This technique of defining new data types that are well-suited to the application to be programmed is
known as data abstraction, and a data type such as BigInt is called an abstract data type. Data
abstraction is a powerful, general-purpose technique which, when properly used, can result in shorter,
more readable, more flexible programs.

Data abstraction is supported by several other modern programming languages such as Ada.

In these languages, and in C++ as well, a programmer can define a new abstract data type by
specifying a data structure together with the operations permissible on that data structure, as shown
in Figure 2-2.


Figure 2-2: An Abstract Data Type


It is difficult or impossible to practice data abstraction in most other programming languages currently
in widespread use, such as BASIC, C, COBOL, FORTRAN, PASCAL, or Modula-2. This is because
data abstraction requires special language features not available in these languages. To get an idea of
what these features do, let's analyze the example C++ program.


The first three statements in the body of the main() program declare three type BigInt variables, a, b,
and c. The C++ compiler needs to know how to create them: how much space to allocate for them
and how to initialize them.

The first and second statements are similar; they initialize the BigInt variables a and b with big integer
constants written as character strings containing only digits. To do this the C++ compiler must be able
to convert character strings into BigInts.

The third statement is the most complicated. It adds a, b, and the integer constant 47 and stores the
result in c. The C++ compiler needs to be able to create a temporary BigInt variable to hold the sum
of a and b. Then it must convert the int constant 47 into a BigInt and add this to the temporary
variable. Finally, it must initialize c from this temporary BigInt variable.

The fourth statement prints c on the standard output, and the last statement calls the C library
function printf() to print a newline character. C programmers are probably familiar with printf(), but
c.print() probably looks a bit strange. It is a call on a special kind of function available in C++ called a
member function. We'll talk more about this later, but for now just think of it as a function that prints
out a variable of type BigInt.

Even though there are no more statements in the body of main(), the compiler isn't finished yet. It
must destroy the BigInt variables a, b, and c and any BigInt temporaries it may have created before
leaving a function, such as main(). This is to assure that the storage used by these variables is freed.

Let's summarize what the C++ compiler needs to know how to do with BigInts to compile the
example program:

■ create new instances of BigInt variables

■ convert character strings and integers to BigInts

■ initialize the value of one BigInt with that of another BigInt

■ add two BigInts together

■ print BigInts

■ destroy BigInts when they are no longer needed
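
To make the list concrete, here is one way those operations might be packaged. This is a simplified
stand-in written for illustration, not the BigInt developed in this chapter: it keeps its digits in a
std::string rather than a hand-managed array in the free store, handles only non-negative values, and
adds a str() member that the chapter's class does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Hypothetical stand-in for BigInt; digits are stored least significant digit first.
class BigInt {
    std::string digits;                          // e.g. "321" represents 123
public:
    BigInt(const char* text) {                   // convert a string of decimal digits
        std::string s(text);
        digits.assign(s.rbegin(), s.rend());
        trim();
    }
    BigInt(int n) {                              // convert a non-negative int
        do { digits += char('0' + n % 10); n /= 10; } while (n > 0);
    }
    // Copying and destruction are generated by the compiler here,
    // because std::string already manages its own storage.

    BigInt operator+(const BigInt& other) const {    // add two BigInts
        BigInt sum(0);
        sum.digits.clear();
        int carry = 0;
        for (std::size_t i = 0;
             i < digits.size() || i < other.digits.size() || carry != 0; ++i) {
            int d = carry;
            if (i < digits.size()) d += digits[i] - '0';
            if (i < other.digits.size()) d += other.digits[i] - '0';
            sum.digits += char('0' + d % 10);
            carry = d / 10;
        }
        return sum;
    }
    std::string str() const { return std::string(digits.rbegin(), digits.rend()); }
    void print() const { std::cout << str(); }       // print a BigInt
private:
    void trim() {                                // drop leading zeros, keep at least one digit
        while (digits.size() > 1 && digits[digits.size() - 1] == '0')
            digits.erase(digits.size() - 1);
        if (digits.empty()) digits = "0";
    }
};
```

With a class along these lines, the three declarations in the example program compile as written, and
adding the two constants and 47 should yield 481147053587865.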


Specifications and Implementations 



Where does the C++ compiler obtain this know-how? From the file BigInt.h, which is included by the
first line of the example program. This file contains the specification of our BigInt abstract data type.
The specification contains the information that programs that use an abstract data type need to have to
be successfully compiled. The details of how the abstract data type works, known as the
implementation, are kept in another file. In our example, this file is named BigInt.c. It is compiled
separately, and the object code produced from it is linked with the program that uses the abstract data
type, also called the client program. Figure 2-3 shows how the specification and implementation of an
abstract data type are combined with the source code of a client program to produce an executable
program.


Figure 2-3: Combining the specification (BigInt.h) and implementation (BigInt.c) of an abstract data type
(BigInt) with the source code of a client program (client.c) to produce an executable
program (client).


We separate the code for an abstract data type into a specification part and an implementation part to
hide the implementation details from the client. We can then change the implementation and be
confident that client programs will continue to work correctly after they are relinked with the modified
object code. This is useful when a team of programmers work on a large software project. Once they
agree on the specifications for the abstract data types they need, each team member can implement one
or more of them independently of the rest of the team.

A well-designed abstract data type also hides its complexity in its implementation, making it as easy
as possible for clients to use.


The Specification


Let's take a look at the specification for our BigInt data type, contained in the file BigInt.h. (Note that
in C++, // begins a comment that extends to the end of the line.)


class BigInt {
    char* digits;                       // pointer to digit array in free store
    int ndigits;                        // number of digits
public:
    BigInt(const char*);                // constructor function
    BigInt(int);                        // constructor function
    BigInt(const BigInt&);              // initialization constructor function
    BigInt operator+(const BigInt&);    // addition operator function
    void print();                       // printing function
    ~BigInt();                          // destructor function
};


Much of this code may look odd, but we'll explain it as we cover the features of C++ in the next few
sections.


Classes 


This is an example of one of the most important features of C++, the class declaration, which defines
an abstract data type. It is an extension of something C programmers are probably already familiar
with: the struct declaration.

The struct declaration groups together a number of variables, which may be of different types, into a
unit. For example, in C (or in C++) we can write:

struct BigInt {
    char* digits;
    int ndigits;
};

We can then declare an instance of this structure by writing:

struct BigInt a;

The individual member variables of the struct, digits and ndigits, can be accessed using the dot (.)
operator; for example, a.digits accesses the member variable digits of the struct a.

Recall that in C we can also declare a pointer to an instance of a structure:

struct BigInt* p;

in which case we can access the individual member variables by using the -> operator; for example,
p->digits.


C++ classes work in a similar manner, and the . and -> operators are used in the same way to access a
class's member variables. In our example, class BigInt has two member variables named digits and
ndigits. The variable digits points to an array of bytes (chars), allocated from the free storage area,
that holds the digits of the big integer, one decimal digit per byte. The digits are ordered beginning
with the least significant digit in the first byte of the array. The member variable ndigits contains the
number of digits in the integer. Figure 2-4 shows a diagram of this data structure for the number
654321.


Figure 2-4: A diagram of the BigInt data structure for the number 654321



However, the C++ class can do much more than the struct feature of regular C. We'll now look at
these extensions in detail.


Encapsulation 


In C++, a client program can declare an instance of class BigInt by writing:

BigInt a;

But now we have a potential problem: the client program might try, for example, to use the fact that
a.ndigits contains the number of digits in the number a. This would make the client program dependent
on the implementation of class BigInt; after all, we might wish to change the representation of
BigInts to use hexadecimal instead of decimal arithmetic to save storage. We need a way to prevent
unauthorized access to the member variables of the instances of a class. C++ provides this by allowing
the use of the keyword public within a class declaration to indicate which members can be accessed
by anyone and which have restricted access. Members declared before the public keyword are private,
as are digits and ndigits in this example, so C++ will issue an error message if a client program
attempts to use them.

Protecting the member variables of a class in this manner is known as encapsulation. It is a good
programming practice because it enforces the separation between the specification and the implementation
of abstract data types that we are trying to achieve, and it helps when debugging programs. For
example, if we find that ndigits has the wrong value in some situation, those parts of the program that
do not have access to the variable are probably not at fault.
Member Functions


But how does a client program interact with the private member variables of a class? Whereas a struct
allows only variables to be grouped together, the C++ class declaration allows both variables and the
functions that operate on them to be grouped. Such functions are called member functions, and the
private member variables of the instances of a class can be accessed only by the member functions of
that class. Thus, a client program can read or modify the values of the private member variables of an
instance of a class indirectly, by calling the public member functions of the class, as shown in Figure
2-5.
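
To make the idea of indirect access concrete, here is a minimal sketch using a small class that is not part of the BigInt example. The names Counter, increment, value, and countThree are assumptions invented for this illustration, and the syntax is that accepted by present-day C++ compilers.

```cpp
// Hypothetical sketch: Counter is not part of the original example.
class Counter {
    int count;                            // private: declared before "public:"
public:
    Counter() { count = 0; }              // constructor sets the initial state
    void increment() { count++; }         // the only way clients can change count
    int value() const { return count; }   // read-only access for clients
};

// A client can reach count only through the public member functions.
int countThree()
{
    Counter c;
    c.increment();
    c.increment();
    c.increment();
    // c.count = 10;   // would not compile: count is private
    return c.value();
}
```

If the representation of Counter changes later, only its member functions need to be rewritten; countThree is unaffected.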


Figure 2-5: Client programs can access the private member variables of an instance of a class only by
calling public member functions of the class.



Our example class BigInt has two private member variables, digits and ndigits, and six public
member functions. The declarations of these member functions will look unusual to C programmers
for several reasons: the types of the arguments of the functions are listed within parentheses in the
function declarations, three of the functions declared have the same name, BigInt, and the function
names operator+ and ~BigInt contain characters normally not allowed in function names.
Function Argument Type Checking

Listing the argument types in a function declaration allows C++ to check for inconsistent argument
types when a function call is compiled, which can eliminate many bugs at an early stage. For example,
the C statement

fprintf("The answer is %d",x);

will compile with no problem. However, when this statement is executed the program will abort with
a cryptic error message. The problem is that the standard C library function fprintf() expects its first
argument to be a FILE*, which has been omitted here. On the other hand, in C++ we can declare the
argument types of fprintf():

extern int fprintf(FILE*, const char*, ...);


so the compiler can give us an error message when we try to compile the incorrect function call,
noting the discrepancy in the argument types. Conveniently, the argument types for most standard
library functions are declared in system header files that you can include in your programs so that you
don't have to write all these common declarations yourself.
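
As a sketch of how the header declaration helps, the fragment below wraps a correct fprintf() call in a small function. The name printAnswer is an assumption made for this illustration. The header used is <cstdio>, the present-day C++ spelling of <stdio.h>.

```cpp
#include <cstdio>

// <cstdio> declares: int fprintf(FILE*, const char*, ...);
// so the compiler checks the first two arguments of every call.
int printAnswer(FILE* out, int x)
{
    // fprintf("The answer is %d\n", x);   // rejected: a string is not a FILE*
    return std::fprintf(out, "The answer is %d\n", x);   // returns characters written
}
```

Calling printAnswer(stdout, 42) writes the line "The answer is 42" to standard output.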


Function Name Overloading 


Listing the types of all of a function's arguments in its declaration has a second benefit: we can define
several functions with the same name, as long as each requires a different number and/or type of
argument. For example, in C++ we can declare two functions with the name abs:

int abs (int); 
float abs(float); 

We can then write: 

x = abs(2);
y = abs(3.14);

The first statement will call abs(int), and the second will call abs(float); the C++ compiler knows
which abs to use because 2 is an int and 3.14 is a float. When more than one function has the same
name like this, the name is said to be overloaded. One advantage of overloading is that it eliminates
"funny" function names (remember ABS, IABS, DABS, and CABS from FORTRAN?). It also leads to
more general programs; for example, we can write copy(x,y) to copy a y to an x without having to
worry about their types (they might be arrays, or strings, or files) as long as we have written a
copy function to handle each case.
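
The following sketch shows overload selection in a form that present-day compilers accept. The name absValue is an assumption, chosen to avoid clashing with the abs already declared by the standard library. One further caution: in current C++ an unsuffixed literal such as 3.14 has type double, so the sketch writes the float argument with an f suffix to select the float version unambiguously.

```cpp
// Hypothetical names: absValue stands in for the abs functions in the text.
int absValue(int v) { return v < 0 ? -v : v; }
float absValue(float v) { return v < 0 ? -v : v; }

int absOfInt() { return absValue(-2); }          // calls absValue(int)
float absOfFloat() { return absValue(-3.5f); }   // calls absValue(float)
```

The compiler picks the version to call from the argument type alone; the two functions share nothing but the name.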
Calling Member Functions


Getting back to our BigInt example and our discussion of member functions, we can now explain the
next-to-last line in our first C++ program, which is:

c.print();

Member functions are called in a manner analogous to the way member variables are normally
accessed in C; that is, by using the . or -> operators. Since c is an instance of class BigInt, the notation
c.print() calls the member function print() of class BigInt to print the current value of c. Similarly, if
we declared a pointer to a BigInt:

BigInt* p;

then the notation p->print() would call the same function. This notation prevents this particular
print() from inadvertently being called to operate on anything other than an instance of class BigInt.

In C++, several different classes may all have member functions with the same name, just as in regular
C several different structs may all have member variables with the same name. This lets us use simple
function names, like print, rather than distinctive names, like print_bigint, without worrying about
naming conflicts. We could add a new class, say BigFloat, to a program that also used BigInts, and
we could also define print() as a member function of class BigFloat. Our program could contain the
statements:

BigInt a = "2934673485419";
BigFloat x = "874387430.3945798";
a.print();
x.print();

and the C++ compiler would use the appropriate print() in both cases.


Constructors 


As you'll recall, one of the things the C++ compiler needs to know about our BigInt abstract data type
is how to create new instances of BigInts. We can tell C++ how we want this done by defining one or
more special member functions called constructors. A constructor function is one which has the same
name as its class. When a client program contains a declaration such as:

BigInt a = "123";

the C++ compiler reserves space for the member variables of an instance of class BigInt and calls the
constructor function a.BigInt("123"). It is our responsibility as providers of the BigInt data type to
write the function BigInt() so that it initializes the instance correctly. In our example, we'll have
BigInt("123") allocate three bytes of dynamic storage, set a.digits to point to this storage, set the three
bytes to {3,2,1}, and set a.ndigits to three. This will create an instance of class BigInt named a that is
initialized to 123.

If a class has a constructor function, C++ guarantees that it will be called to initialize every instance of
the class that is created. A user of an abstract data type such as BigInt does not have to remember to
call an initialization function separately for every BigInt declared, thus eliminating a common source
of programming errors.


Constructors and Type Conversion 

The second thing C++ needs to know is how to convert something that is a character string, such as
"25123654789456", or an integer, such as 47, to a BigInt. Constructors are also used for this purpose.
When the C++ compiler sees a statement like:

BigInt c = a + b + 47;

it recognizes that the int 47 must be converted to a BigInt before the addition can be done, and so
checks to see if the constructor BigInt(int) is declared. If so, it creates a temporary instance of BigInt
by calling BigInt(int) with the argument 47. If an appropriate constructor is not declared, the statement
is flagged as an error. We have defined BigInt(const char*) and BigInt(int) for BigInt, so we
may freely use character strings or integers wherever a BigInt can be used, and the C++ compiler will
automatically call our constructor to do the type conversion. This is an important feature of C++
because it lets us blend our own abstract data types with others and with the fundamental types built
into the language.
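
The conversion described above can be sketched with a much smaller class. Meters, count, and totalLength are assumptions invented for this illustration; they are not part of the BigInt example.

```cpp
// Hypothetical class: Meters is not part of the BigInt example.
class Meters {
    int value;
public:
    Meters(int v) { value = v; }   // also serves as the int -> Meters conversion
    Meters operator+(const Meters& other) const { return Meters(value + other.value); }
    int count() const { return value; }
};

int totalLength()
{
    Meters a = 5;
    Meters c = a + 47;   // 47 is converted by calling Meters(int)
    return c.count();
}
```

If the Meters(int) constructor were removed, the expression a + 47 would be flagged as an error, just as the text describes for BigInt.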


Constructors and Initialization 

The third thing C++ must know how to do is how to initialize a BigInt with the value of another
BigInt, as is required by a statement such as:

BigInt c = a + b + 47;


The BigInt c must be initialized with the value of a temporary BigInt that holds the result of the
expression a + b + 47.

We can control how C++ initializes instances of class BigInt by defining the special constructor
function BigInt(const BigInt&). In our example, we'll make this constructor allocate storage for the new
instance and make a copy of the contents of the argument instance.
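
A minimal sketch of such an initialization constructor follows, using a hypothetical Buffer class rather than BigInt. All names are assumptions made for this illustration, and the destructor uses the delete[] form that present-day C++ requires for arrays.

```cpp
// Hypothetical class illustrating an initialization (copy) constructor.
class Buffer {
    char* data;
    int size;
public:
    Buffer(int n, char fill) {
        size = n;
        data = new char[n];
        for (int i = 0; i < n; i++) data[i] = fill;
    }
    Buffer(const Buffer& other) {      // allocate new storage and copy contents
        size = other.size;
        data = new char[size];
        for (int i = 0; i < size; i++) data[i] = other.data[i];
    }
    ~Buffer() { delete[] data; }
    void set(int i, char c) { data[i] = c; }
    char get(int i) const { return data[i]; }
};

// Changing the copy must not change the original.
bool copyIsIndependent()
{
    Buffer a(3, 'x');
    Buffer b = a;        // calls Buffer(const Buffer&)
    b.set(0, 'y');
    return a.get(0) == 'x' && b.get(0) == 'y';
}
```

Because each instance owns its own storage, destroying one of them cannot leave the other pointing at freed memory.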


Operator Overloading 

The fourth thing C++ must be able to do is to add two BigInts. We could just define a member
function named add to do this, but then writing arithmetic expressions would be awkward. C++ lets us
define additional meanings for most of its operators, including +, so we can make it mean "add" when
applied to BigInts. This is known as operator overloading, and is similar to the concept of function
name overloading.

Actually, most programmers are already familiar with this idea because the operators of most
programming languages, including C, are already overloaded. For example, we can write:

int a,b,c;
float x,y,z;
c = a+b;
z = x+y;

The operators = and + do quite different things in the last two statements: the first statement does
integer addition and assignment and the second does floating point addition and assignment. Operator
overloading is simply an extension of this.

C++ recognizes a function name having the form operator@ as an overloading of the C++ operator
symbol @. We can overload the operator +, for example, by declaring the member function named
operator+, as we have done in our example class BigInt. We can call this function using either the
usual notation for calling member functions or by using just the operator:

BigInt a,b,c;
c = a.operator+(b);
c = a + b;

The last two lines are equivalent.

Of course, if we overload an operator, we don't change its built-in meaning, we only give it an
additional meaning when used on instances of our new abstract data type. The expression 2+2 still gives 4.
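
The equivalence of the two call notations can be checked with a small stand-alone class. Fraction and the function bothNotationsAgree are assumptions invented for this sketch, which does not reduce results to lowest terms.

```cpp
// Hypothetical class: Fraction is not part of the BigInt example.
class Fraction {
    int num, den;
public:
    Fraction(int n, int d) { num = n; den = d; }
    // Overloads + for Fractions (no reduction to lowest terms, to keep it short).
    Fraction operator+(const Fraction& f) const {
        return Fraction(num * f.den + f.num * den, den * f.den);
    }
    int numerator() const { return num; }
    int denominator() const { return den; }
};

bool bothNotationsAgree()
{
    Fraction a(1, 2), b(1, 3);
    Fraction c = a.operator+(b);   // explicit member-function call
    Fraction d = a + b;            // operator notation
    return c.numerator() == 5 && c.denominator() == 6
        && d.numerator() == 5 && d.denominator() == 6;
}
```

Both forms compute 1/2 + 1/3 = 5/6; the operator notation is simply a more readable way of writing the same member-function call.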


Destructors 

The last thing we said was that C++ needed to know how to destroy instances of our BigInts once it
was finished with them. We can tell the C++ compiler how to do this by defining another special kind
of member function called a destructor. A destructor function has the same name as its class, prefixed
by the character ~. For class BigInt this is the member function ~BigInt(). Since ~ is the C++ and C
complement operator, this naming convention suggests that destructors are complementary to
constructors.

We must write the function ~BigInt() so that it properly cleans up, or finalizes, instances of class BigInt
for which it is called. In our example, this means freeing the dynamic storage that was allocated by
the constructor.

If a class has a destructor function, C++ guarantees that it will be called to finalize every instance of the
class when it is no longer needed. Once again, this relieves users of an abstract data type like BigInt
from having to remember to do this, and eliminates another source of programming errors.
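
The guarantee can be observed with a class that merely counts its live instances. The names liveCount, Tracked, and countInsideBlock are assumptions made for this sketch.

```cpp
// Hypothetical sketch: liveCount and Tracked are invented for this example.
int liveCount = 0;                 // number of Tracked instances currently alive

class Tracked {
public:
    Tracked() { liveCount++; }     // constructor: runs when an instance is created
    ~Tracked() { liveCount--; }    // destructor: runs when it goes out of scope
};

int countInsideBlock()
{
    int seen;
    {
        Tracked a;
        Tracked b;
        seen = liveCount;          // both instances are alive here
    }                              // a and b are destroyed at this brace
    return seen;
}
```

The client never calls the destructor explicitly; leaving the inner block is enough to finalize both instances.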
Summary

We've covered a lot of territory already, so let's review where we've been. 

We've seen how using the technique of data abstraction can lead to more reliable, more readable, and 
more flexible programs, and we've introduced many of the features of C++ that help us practice data 
abstraction: 

■ classes, the basic language construct for defining new abstract data types; 

■ member variables, which describe the data in an abstract data type, and member functions, which define
the operations on an abstract data type;

■ encapsulation, which lets us restrict access to certain member variables and functions;

■ function argument type checking, which helps to ensure that functions are called with proper
arguments;

■ function name overloading, which reduces the need for using unusual function names and helps to
generalize code;

■ constructors and destructors, which manage the storage for an abstract data type and guarantee
that instances of an abstract data type are initialized and finalized;

■ user-defined implicit type conversion, to let us blend our abstract data types with others and with
the fundamental data types of the language; and,

■ operator overloading, to let us give additional meaning to most of the existing operators when
used with our own abstract data types, making our new data types easier to use.

We've also introduced the idea of breaking up an abstract data type into its specification, which
contains the information that the user, or client, needs to know to use the abstract data type, and its
implementation, which hides the details of how the abstract data type works so that it may be
programmed independently by a member of a programming team and be easily maintained.
We've just taken a detailed look at the specification of our BigInt abstract data type. Now it's time to
discuss its implementation.

As we said earlier, the implementation of an abstract data type consists of the C++ code that implements
the details of how the data abstraction works. For our example it is kept in a separate file named
BigInt.c.

The implementation requires the information kept in the specification, so the first line in BigInt.c is:

#include "BigInt.h"

Since both the implementation and client programs are compiled with the same specification, the C++
compiler ensures a consistent interface between them.


The BigInt(const char*) Constructor 

Class BigInt has three constructors, one to create an instance of a BigInt from a character string of
digits (a char*), one to create an instance from an integer (an int), and one to initialize one BigInt from
another. We need to be able to create a BigInt from a string of digits because this is the only way we
can legally write very large integer constants in C++. Creating a BigInt from an int is provided as a
convenience, so we can write small integers in the usual way.

Here is the implementation of the first constructor:

BigInt::BigInt(const char* digitString)
{
    int n = strlen(digitString);
    if (n != 0) {
        digits = new char[ndigits=n];
        char* p = digits;
        const char* q = &digitString[n];
        while (n--) *p++ = *--q - '0';
    }
    else {    // empty string
        digits = new char[ndigits=1];
        digits[0] = 0;
    }
}


This constructor initializes the data structure of a BigInt as we described previously. We determine
the length of the character string argument, allocate enough memory to hold the digits of the number,
then scan the character string from right to left, converting each digit character to its binary
representation.

If the character string is empty we treat this as a special case and create a BigInt initialized to zero.

C programmers will find this code quite recognizable, with a few exceptions that we'll explain in the
next few sections.
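
The right-to-left scan can be tried in isolation with a free function that writes into a caller-supplied array instead of allocating storage. The name toDigits and its interface are assumptions made for this sketch; it performs the same conversion as the constructor but is not the constructor itself.

```cpp
#include <cstring>

// Fills out[] with the digits of digitString, least significant first,
// and returns the number of digits stored. out must have room for them.
int toDigits(const char* digitString, char* out)
{
    int n = static_cast<int>(std::strlen(digitString));
    if (n == 0) {                        // empty string: treat as zero
        out[0] = 0;
        return 1;
    }
    const char* q = &digitString[n];     // one past the last character
    for (int i = 0; i < n; i++) out[i] = *--q - '0';
    return n;
}
```

For the string "123" the function stores the values 3, 2, 1 in that order, matching the layout described for the BigInt digit array.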


The Scope Resolution Operator 

The notation BigInt::BigInt identifies BigInt as a member function of class BigInt. We mentioned earlier
that several C++ classes can have member functions with the same names. When it is necessary to
specify exactly which class member we're dealing with, we can prefix the member name by the class
name and the :: operator. The :: operator is known as the scope resolution operator, and it may be
applied to both member functions and member variables.
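
As a brief sketch, two unrelated classes can each declare a member function with the same name, and the :: operator says which one each definition belongs to. Circle, Square, and sides are assumptions invented for this illustration.

```cpp
// Hypothetical classes sharing a member function name.
class Circle {
public:
    int sides();
};

class Square {
public:
    int sides();
};

// The class name and :: identify which sides() is being defined.
int Circle::sides() { return 0; }
int Square::sides() { return 4; }
```

Without the Circle:: and Square:: prefixes, the two definitions would be indistinguishable ordinary functions and the second would be rejected as a redefinition.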


Constant Types 

C programmers will be familiar with use of the type char* for arguments that are character strings, but
what is a const char*? In C++, the keyword const can be used before a type to indicate that the variable
being declared is constant, and therefore may not appear to the left of the assignment (=) operator.
When used in an argument list as it is above, it prevents the characters that the argument points to
from being modified by the function. This protects against another kind of common programming error.
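
A short sketch of a function taking a const char* argument follows. The name countChar is an assumption made for this illustration.

```cpp
// Hypothetical helper showing a const char* argument.
int countChar(const char* s, char c)
{
    int count = 0;
    while (*s) {
        if (*s == c) count++;
        // *s = ' ';   // would not compile: the characters are const
        s++;           // the pointer itself may still be advanced
    }
    return count;
}
```

Note that const here protects the characters pointed to, not the local pointer s, which the function is free to move along the string.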


Member Variable References 

Throughout the body of the member function, you'll notice that we are able to reference the member
variables of the instance for which the function is called without using the . or -> operators, as we did
for example in the statement:

digits = new char[ndigits=n];

Since member functions reference the member variables of their class frequently, this provides a
convenient, short notation.


The new Operator 


We used the C++ new operator to allocate the dynamic storage needed to hold the digits of a BigInt.
In C, we would call the standard C library function malloc() to do this. The new operator has two
advantages, however. First, it returns a pointer of the appropriate data type. Thus, to allocate space
for the member variables of a struct BigInt in C we would write:

(struct BigInt*)malloc(sizeof(struct BigInt))

whereas in C++ we can write:

new BigInt

The second advantage is that if we use new to allocate an instance of a class having a constructor
function (such as BigInt), the constructor is called automatically to initialize the newly allocated instance.
The result is more readable, less error-prone code.


Placement of Declarations 


C programmers may have noticed that the declaration of p seems to be "misplaced":

if (n != 0) {
    digits = new char[ndigits=n];    // a statement
    char* p = digits;                // a declaration!

since it appears after the first statement in a block. In C++, declarations may be intermixed with
statements as long as each variable is declared before its first use. You can frequently improve the
readability of a program by placing variable declarations near the place where they are used.


The BigInt(int) Constructor

Here's the implementation of the BigInt(int) constructor, which creates a BigInt from an integer:

BigInt::BigInt(int n)
{
    char d[3*sizeof(int)+1];    // buffer for decimal digits
    char* dp = d;               // pointer to next decimal digit
    ndigits = 0;
    do {                        // convert integer to decimal digits
        *dp++ = n%10;
        n /= 10;
        ndigits++;
    } while (n > 0);
    digits = new char[ndigits];
    register int i;
    for (i=0; i<ndigits; i++) digits[i] = d[i];
}


This constructor works by converting the integer argument to decimal digits in the temporary array d.
We then know how much space to allocate for the BigInt, so we allocate the correct amount of
dynamic storage using the new operator, and copy the decimal digits from the temporary array into it.
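
The digit-extraction loop can be exercised on its own with a free function that writes into a caller-supplied array. The name intToDigits is an assumption made for this sketch. Like the loop in the constructor, it assumes a non-negative argument: the loop condition n > 0 stops immediately for negative values.

```cpp
// Stores the decimal digits of n (n >= 0) in out[], least significant first,
// and returns how many digits were stored.
int intToDigits(int n, char* out)
{
    int count = 0;
    do {
        out[count++] = n % 10;   // peel off the least significant digit
        n /= 10;                 // and shift the rest down
    } while (n > 0);
    return count;
}
```

Because the loop is a do-while, the value 0 still produces one digit, so a BigInt built from 0 has ndigits equal to 1 rather than 0.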
The Initialization Constructor


The job of the initialization constructor is to copy the value of its BigInt argument into a new instance
of BigInt:

BigInt::BigInt(const BigInt& n)
{
    int i = n.ndigits;
    digits = new char[ndigits=i];
    char* p = digits;
    char* q = n.digits;
    while (i--) *p++ = *q++;
}


This function makes use of a reference, an important C++ feature we haven't seen before. 


References 

The argument type of the member function BigInt(const BigInt&) is an example of a C++ reference.
References address a serious deficiency of C: the lack of a way to pass function arguments by
reference.

To understand what this means, suppose we wish to write a function named inc() that adds one to its
argument. If we wrote this in C as:

void inc(x)
int x;
{
    x++;
}

and then called inc() with the following program:

int y = 1;
inc(y);
printf("%d\n",y);

we would discover that the program would print a 1, not a 2. This is because in C the value of y is
copied into the argument x, and the statement x++ increments this copy, leaving the value of y
unchanged. This treatment of function arguments is known as call by value.

To do this correctly in C we must explicitly pass a pointer as the argument to inc(): 
void inc(x)
int* x;
{
    (*x)++;
}

int y = 1;
inc(&y);
printf("%d\n",y);

Notice that we had to change the program in three ways:

■ the type of the function argument was changed from an int to an int*;

■ each occurrence of the argument in the body of the function was changed from x to *x; and,

■ each call of the function was changed from inc(y) to inc(&y).

The point is that passing a pointer as a function argument requires consistency in every usage of the
argument within the function body and, worse yet, in every call of the function made by client
programs. This, combined with C's lack of function argument type checking, results in ample opportunity
for error.

Using a C++ reference, we can write the function inc() as follows:

void inc(int& x)
{
    x++;
}

int y = 1;
inc(y);
printf("%d\n",y);

This requires changing only the argument type from int to int&.

In the function inc(), we need to pass the argument x using a reference because its value is modified
by the function. But efficiency is another reason for passing arguments by reference. When the value
of an argument requires a lot of storage, as in the case of BigInts, it is less expensive to pass a pointer
to the argument even though its value is not to be changed. That's why we declared the argument to
BigInt as const BigInt&: the reference BigInt& causes just a pointer to the argument to be passed,
but the const prevents that pointer from being used to change the argument's value from within the
function.
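
The three ways of passing the argument can be placed side by side. The function names incByValue, incByPointer, and incByReference are assumptions chosen for this sketch, so that all three versions can coexist in one program.

```cpp
void incByValue(int x) { x++; }          // changes only the local copy
void incByPointer(int* x) { (*x)++; }    // caller must pass &y
void incByReference(int& x) { x++; }     // caller passes y directly
```

Starting from y equal to 1, calling incByValue(y) leaves y at 1, incByPointer(&y) raises it to 2, and incByReference(y) raises it to 3. Note the parentheses in (*x)++: without them the expression would advance the pointer rather than the value it points to.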


The Addition Operator 

Let's take a look at a first draft of the function operator+, which implements BigInt addition:

BigInt BigInt::operator+(const BigInt& n)
{
    // Calculate maximum possible number of digits in sum
    int maxDigits = (ndigits>n.ndigits ? ndigits : n.ndigits)+1;
    char* sumPtr = new char[maxDigits];    // allocate storage for sum
    BigInt sum(sumPtr, maxDigits);         // must define this constructor
    int i = maxDigits;
    int carry = 0;
    while (i--) {
        *sumPtr = /*next digit of this*/ + /*next digit of n*/ + carry;
        if (*sumPtr > 9) {
            carry = 1;
            *sumPtr -= 10;
        }
        else carry = 0;
        sumPtr++;
    }
    return sum;
}


We add two BigInts by using the paper-and-pencil method we all learned in grammar school: we add
the digits of each operand from right to left, beginning with the rightmost, and also add a possible
carry in from the previous column. If the sum is greater than nine, we subtract ten from the result
and produce a carry.


The BigInt(char*, int) Constructor


We ran into a couple of problems when writing the addition function, which we indicated with
comments in the code. The first problem is that we need to declare an instance of BigInt named sum in
which to place the result of the addition, which will be left in the array pointed to by sumPtr. We
must use a constructor to create this instance of BigInt, but none of those we have defined thus far are
suitable, so we must write another.

This new constructor takes a pointer to an array containing the digits and the number of digits in the
array as arguments and creates a BigInt from them. We don't want our client programs to use such
an unsafe and implementation-dependent function, so we'll declare it in the private part of class BigInt
where it can only be used by member functions. Thus, we add the declaration:

BigInt(char*, int);

just before the keyword public in the declaration of class BigInt in the file BigInt.h, and we add the
implementation of this constructor to the file BigInt.c:

BigInt::BigInt(char* d, int n)
{
    digits = d;
    ndigits = n;
}


Class DigitStream 

The second problem we encountered is that scanning the digits of the operands in the statement

*sumPtr = /*next digit of this*/ + /*next digit of n*/ + carry;

becomes complicated because one of the operands may contain fewer digits than the other, in which
case we must pad it to the left with zeros. We would also face this problem when implementing
BigInt subtraction, multiplication, and division, so it is worthwhile to find a clean solution. Let's use
an abstract data type!

Here is the declaration for class DigitStream and the implementation of its member functions:

class DigitStream {
    char* dp;    // pointer to current digit
    int nd;      // number of digits remaining
public:
    DigitStream(const BigInt& n);    // constructor
    int operator++();                // return current digit and advance
};

DigitStream::DigitStream(const BigInt& n)
{
    dp = n.digits;
    nd = n.ndigits;
}

int DigitStream::operator++()
{
    if (nd == 0) return 0;
    else {
        nd--;
        return *dp++;
    }
}

We can now declare an instance of a DigitStream for each of the operands and use the ++ operator
when we need to read the next digit.
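
The zero-padding behavior can be tried without the rest of BigInt. The sketch below is a stand-alone variant under stated assumptions: DigitReader and addDigits are hypothetical names, the reader takes a plain digit array rather than a BigInt, and the operator is declared as operator++(int), which is how present-day C++ marks the postfix form used in an expression such as a++.

```cpp
// Hypothetical stand-alone version of the idea behind DigitStream: it reads
// from a plain digit array rather than from a BigInt.
class DigitReader {
    const char* dp;   // pointer to current digit
    int nd;           // number of digits remaining
public:
    DigitReader(const char* digits, int n) { dp = digits; nd = n; }
    // Postfix ++ (marked by the unused int parameter in present-day C++):
    // returns the current digit and advances, or 0 once the digits run out.
    int operator++(int) {
        if (nd == 0) return 0;
        nd--;
        return *dp++;
    }
};

// Adds two digit arrays, least significant digit first, into sum[], which
// must hold max(nx, ny) + 1 digits. Returns the number of digits written.
int addDigits(const char* x, int nx, const char* y, int ny, char* sum)
{
    int maxDigits = (nx > ny ? nx : ny) + 1;
    DigitReader a(x, nx);
    DigitReader b(y, ny);
    int carry = 0;
    for (int i = 0; i < maxDigits; i++) {
        int s = (a++) + (b++) + carry;   // exhausted operands contribute 0
        if (s > 9) { carry = 1; s -= 10; }
        else carry = 0;
        sum[i] = s;
    }
    return maxDigits;
}
```

Adding the digit arrays for 99 and 1 yields the digits 0, 0, 1, that is, 100 stored least significant digit first; the shorter operand is padded with zeros by the reader rather than by the addition loop.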

With these two problems solved, the implementation of the BigInt addition operator looks like:

BigInt BigInt::operator+(const BigInt& n)
{
    int maxDigits = (ndigits>n.ndigits ? ndigits : n.ndigits)+1;
    char* sumPtr = new char[maxDigits];
    BigInt sum(sumPtr, maxDigits);
    DigitStream a(*this);
    DigitStream b(n);
    int i = maxDigits;
    int carry = 0;
    while (i--) {
        *sumPtr = (a++) + (b++) + carry;
        if (*sumPtr > 9) {
            carry = 1;
            *sumPtr -= 10;
        }
        else carry = 0;
        sumPtr++;
    }
    return sum;
}


Friend Functions 


Our abstract data type DigitStream looks quite elegant, but you may be wondering how the constructor
DigitStream(const BigInt&) is able to access the member variables digits and ndigits of class
BigInt. After all, digits and ndigits are private, and DigitStream(const BigInt&) is not a member
function of class BigInt.

Well, it can't. We need a way to grant access to these variables to just this one function. C++ provides
us with a way to do this: we can make this constructor a friend of class BigInt by adding the
declaration:

friend DigitStream::DigitStream(const BigInt&);

to the declaration of class BigInt.

We can also make all of the member functions of one class friends of another by declaring the entire
class as a friend. For example, we can make all of the member functions of class DigitStream friends
of class BigInt by placing the declaration:

friend DigitStream;

in the declaration of class BigInt.
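
A minimal sketch of a friend class follows. Vault, Inspector, and peek are assumptions invented for this illustration. The declaration is written friend class Inspector, the spelling accepted by present-day compilers, where the text above writes friend DigitStream.

```cpp
class Vault;   // forward declaration so Inspector can mention Vault

class Inspector {
public:
    int peek(const Vault& v);    // defined after Vault is complete
};

class Vault {
    int secret;                  // private member
    friend class Inspector;      // every Inspector member function may access it
public:
    Vault(int s) { secret = s; }
};

int Inspector::peek(const Vault& v)
{
    return v.secret;             // allowed only because Inspector is a friend
}
```

Friendship is granted by the class that owns the private members: Vault names Inspector, not the other way around, so Vault's author stays in control of who can see its representation.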
The Keyword this


Going back to the implementation of the function operator+(), you may be wondering where the
pointer variable this came from in the declaration:

DigitStream a(*this);


Previously, we described how within the body of a member function we could refer to the members of
the instance for which the function was called without using the . or -> operators. C++ also gives us
the keyword this so that we may refer to the entire instance as a unit. The keyword this is essentially
a pointer to this instance, and in our example may be thought of as a variable of type BigInt*. Thus,
the declaration DigitStream a(*this) creates an instance of DigitStream for the left operand of
operator+.
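
The relationship between this and *this can be sketched with a small class. Node, self, and copy are assumptions made for this illustration.

```cpp
// Hypothetical class showing this used to refer to the whole instance.
class Node {
    int value;
public:
    Node(int v) { value = v; }
    Node* self() { return this; }    // this has type Node* here
    Node copy() { return *this; }    // *this is the instance itself
    int get() const { return value; }
};
```

For a Node named n, the call n.self() returns the address of n, and n.copy() returns a new Node holding the same value.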


The Member Function BigInt::print()

The implementation of the member function print() is straightforward:

void BigInt::print()
{
    int i;
    for (i = ndigits-1; i >= 0; i--) printf("%d",digits[i]);
}

It loops through the digits array from the most significant through the least significant digits, calling
the standard C library function printf() to print each digit.


The Biglnt Destructor 

The only thing that the BigInt destructor function ~BigInt() must do is free the dynamic storage
allocated by the constructors:

BigInt::~BigInt()
{
    delete digits;
}

This is done using the C++ delete operator, which in this case frees the dynamic storage that is
pointed to by digits. The delete operator does what is usually accomplished in C by calling the
standard C library function free, but in addition, if we use delete to deallocate an instance of a class
having a destructor function, the destructor is called automatically to finalize the instance just before its
storage is freed. The delete operator is thus the inverse of the new operator.
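
The difference between new/delete and malloc/free can be observed by counting constructor and destructor calls. Widget, constructed, destroyed, and the two functions are assumptions invented for this sketch. One caution for readers using present-day compilers: storage allocated with the array form new char[n] must be released with delete[], not plain delete.

```cpp
#include <cstdlib>

// Hypothetical class that records whether its constructor/destructor ran.
int constructed = 0;
int destroyed = 0;

class Widget {
public:
    Widget() { constructed++; }
    ~Widget() { destroyed++; }
};

void useNewAndDelete()
{
    Widget* w = new Widget;   // allocates storage, then calls Widget()
    delete w;                 // calls ~Widget(), then frees the storage
}

void useMallocAndFree()
{
    // malloc returns void*, so a cast is needed, and no constructor runs.
    Widget* w = (Widget*)std::malloc(sizeof(Widget));
    std::free(w);             // no destructor runs either
}
```

After useNewAndDelete() both counters have advanced by one; after useMallocAndFree() neither has changed, because malloc and free know nothing about constructors or destructors.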
Inline Functions

By now you may be thinking that the overhead of calling all of these little member functions must
make C++ inefficient. This would be unacceptable for a proper successor to C, which is renowned for
its efficiency! So C++ allows us to declare a function to be inline, in which case each call of the
function is replaced by a copy of the entire function, much like the substitution performed for the #define
preprocessor command. This entirely eliminates the overhead of calling a function, and makes
encapsulation practical.

To make a function such as ~BigInt() inline, we must move its implementation from the file BigInt.c to
the file BigInt.h and add the keyword inline to the function definition:

    inline BigInt::~BigInt()
    {
        delete digits;
    }

The function definition must be in BigInt.h because it will be needed by the compiler whenever a
client program uses a BigInt.

Small functions make the best candidates for inline compilation. C++ gives us a convenient shorthand
for writing inline functions: we can include the function body in the function declaration within the
class declaration. Thus, we can also make ~BigInt() inline by writing:

    ~BigInt() { delete digits; }

in the declaration of class BigInt.

Here is a complete version of BigInt.h showing appropriate functions made inline:

    #include <stdio.h>

    class BigInt {
        char* digits;                       // pointer to digit array in free store
        int ndigits;                        // number of digits
        BigInt(char* d, int n) {            // constructor function
            digits = d;
            ndigits = n;
        }
        friend DigitStream;
    public:
        BigInt(const char*);                // constructor function
        BigInt(int);                        // constructor function
        BigInt(const BigInt&);              // initialization constructor function
        BigInt operator+(const BigInt&);    // addition operator function
        void print();                       // printing function
        ~BigInt() { delete digits; }        // destructor function
    };

    class DigitStream {
        char* dp;                           // pointer to current digit
        int nd;                             // number of digits remaining
    public:
        DigitStream(const BigInt& n) {      // constructor function
            dp = n.digits;
            nd = n.ndigits;
        }
        int operator++() {                  // return current digit and advance
            if (nd == 0) return 0;
            else {
                nd--;
                return *dp++;
            }
        }
    };


Summary 

This completes our example abstract data type BigInt. Let's review the C++ features presented in this
section:

■ the scope resolution operator, which allows us to specify which class we mean when one or more
classes have member variables or functions with the same name;

■ constant types, which we can use to protect variables or function arguments from unintended
modification;

■ implicit member variable references and the keyword this, which are used within member functions
to access the instance for which the function is called;

■ the new and delete operators, which manage the free storage area and call class
constructors/destructors if present;

■ references, which we can use to conveniently pass pointers to instances instead of the instances
themselves as function arguments;

■ friend functions, which give us a way to grant access to the private member variables and
functions of a class to other functions and classes; and,

■ inline functions, which make data abstraction in C++ efficient and practical.

Our BigInt data type is an obvious application for the technique of data abstraction because it
is a numeric data type, like int, and it is natural to extend the meanings of C++'s arithmetic operators
to apply to BigInts. As you become more familiar with this technique, you'll discover many
opportunities for using abstract data types in your programs. Here are a few examples:


Dynamic Character Strings

We can define a dynamic (i.e., variable length) character string abstract data type that works like the
string variables in languages such as BASIC. We can overload the operators & and &= to concatenate
character strings, overload the relational operators and so on to compare character strings,
and overload the array subscript operator [ ] to address the individual characters of a string. The
function call operator

    operator()(int position, int length)

can be overloaded to perform substring extraction and replacement.
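
A minimal sketch of such a string type follows. The class name DynString and its members are
hypothetical and are not the interface of any library class. The sketch assumes present-day standard
C++, stores its characters in a std::string, and shows substring extraction only, not replacement:

```cpp
#include <string>

// Hypothetical string class; the names are illustrative only.
class DynString {
    std::string text;
public:
    DynString(const char* s) : text(s) {}
    DynString(const std::string& s) : text(s) {}

    // Concatenation with the & operator.
    DynString operator&(const DynString& rhs) const { return DynString(text + rhs.text); }

    // Relational operators compare the contents of the two strings.
    bool operator==(const DynString& rhs) const { return text == rhs.text; }
    bool operator<(const DynString& rhs) const { return text < rhs.text; }

    // The subscript operator addresses an individual character.
    char& operator[](int i) { return text[i]; }

    // The function call operator extracts 'length' characters starting at 'position'.
    DynString operator()(int position, int length) const
    {
        return DynString(text.substr(position, length));
    }

    const std::string& str() const { return text; }
};
```

With these definitions, DynString("data") & DynString(" abstraction") yields a string s for which
s(5, 8) is the substring "abstract".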


Complex Numbers


C has no built-in complex data type, but it's easy to define one in C++. In fact, one
is distributed with the C++ compiler. Class complex has two member variables of type double that
hold the real and imaginary parts of a complex number, and all of the usual arithmetic operators are
overloaded to perform complex arithmetic when applied to instances of complex. Many of the
functions in the math library, such as cos() and sqrt(), are overloaded for complex arguments.


Vectors


Vectors are another useful abstract data type. We can define classes for vectors of the fundamental
data types, such as FloatVec, DoubleVec, and IntVec, and overload the arithmetic operators to apply
element-by-element to vectors. The array subscript operator [ ] can be overloaded to check the range
of vector subscripts or to handle vectors with arbitrary subscript bounds. It's also possible to overload
the function call operator ( ) to subscript multi-dimensional arrays.
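
As a sketch of the range-checking idea, the hypothetical IntVec below overloads [ ] to reject
out-of-range subscripts and + to add element by element. It assumes present-day standard C++; in
particular, reporting errors with exceptions is our choice for the sketch and is not a Release 2.0
feature:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical IntVec; a real library class would offer much more.
class IntVec {
    std::vector<int> v;
public:
    explicit IntVec(std::size_t n) : v(n, 0) {}
    std::size_t size() const { return v.size(); }

    // Subscript operator that checks the range before indexing.
    int& operator[](std::size_t i)
    {
        if (i >= v.size()) throw std::out_of_range("IntVec subscript out of range");
        return v[i];
    }

    // Element-by-element addition of two vectors of equal length.
    IntVec operator+(const IntVec& rhs) const
    {
        if (size() != rhs.size()) throw std::invalid_argument("IntVec size mismatch");
        IntVec sum(size());
        for (std::size_t i = 0; i < size(); ++i) sum.v[i] = v[i] + rhs.v[i];
        return sum;
    }
};
```

A subscript such as c[3] on a three-element IntVec is then caught at run time instead of silently
reading or writing past the end of the array.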


Stream I/O


A stream I/O package is distributed with the C++ compiler that defines the class iostream
(input/output stream) for doing formatted I/O. This class defines an instance named cin connected to
the standard input file and overloads the operator >> for all the fundamental data types so we can
write:

    float x;
    int i;
    char* s;

    cin >> x >> i >> s;

to read a float, an int, and a character string from the standard input file, for example. The
advantage of this over using the C library function scanf() is that it is not possible to make the following
types of errors:

    int i;
    scanf("%f",&i);         // float format for int
    scanf("%d",i);          // int instead of int*

Similarly, class iostream defines an instance named cout connected to the standard output file and an
instance named cerr connected to the standard error file. It overloads the operator << for all the
fundamental data types so we can write:

    cout << x << i << s;

to write a float, an int, and a character string to the standard output file.

We can also add our own overloadings for the operators >> and << for classes we've written so we can
read or write instances of these classes using the same notation.
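
Adding such overloadings might look like the sketch below. The struct Fraction and the helper
roundTrip() are hypothetical, and the code assumes the present-day standard headers <iostream> and
<sstream> rather than the stream library shipped with Release 2.0:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical class standing in for "a class we've written".
struct Fraction {
    int num;
    int den;
};

// Output: writes the fraction as num/den.
std::ostream& operator<<(std::ostream& os, const Fraction& f)
{
    return os << f.num << '/' << f.den;
}

// Input: expects the same num/den notation and marks the stream as failed otherwise.
std::istream& operator>>(std::istream& is, Fraction& f)
{
    int n = 0, d = 0;
    char slash = 0;
    if (is >> n >> slash >> d && slash == '/') { f.num = n; f.den = d; }
    else is.setstate(std::ios::failbit);
    return is;
}

// Reads a Fraction from text and writes it back, using string streams instead of cin and cout.
std::string roundTrip(const std::string& text)
{
    std::istringstream in(text);
    Fraction f = {0, 1};
    in >> f;
    std::ostringstream out;
    out << f;
    return out.str();
}
```

Because the new operators return the stream they were given, a Fraction can appear in the middle of a
chain such as cout << x << f << s, exactly like a fundamental type.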
Perhaps the most interesting features of C++ are those that support the style of programming known
as object-oriented programming. Object-oriented programming is generally useful, but is particularly
suited for interactive graphics, simulation, and systems programming applications.


Derived Classes 


Suppose we have written a C++ class defining an abstract data type, and we need another abstract
data type that is similar to it. Perhaps it requires some additional member variables or functions, or a
few of its member functions must do something differently. We'd like to reuse the code we've already
written and debugged as much as possible. C++ gives us a simple way to accomplish this: we can
declare the new class as a derived class of our existing class, called the base class. The derived class
inherits all of the member variables and functions of its base class. We can then differentiate the
derived class from its base class by adding member variables, adding member functions, or re-defining
member functions inherited from the base class.

A base class may have more than one derived class, and a derived class may, in turn, serve as the base 
class for other derived classes. Thus, we can define an entire tree-structured arrangement of related 
classes. This gives us a coherent way to organize classes and to share common code among them. 


Virtual Functions 



Now suppose we're writing a graphics package, and we've written some classes for various geometric
shapes, such as Line, Triangle, Rectangle, and Circle. All of these classes implement some of the
same member functions, for example draw() and move(). The relevant class declarations for class Line
and class Circle would look like this:

    class Line {
        int x1,y1,x2,y2;                        // end point coordinates
    public:
        Line(int xx1,int yy1,int xx2,int yy2)   // constructor
            { x1=xx1; y1=yy1; x2=xx2; y2=yy2; }
        void draw();                            // draw a line from (x1,y1) to (x2,y2)
        void move(int dx, int dy);              // move line by amount dx,dy
    };

    class Circle {
        int x,y;                                // center of circle
        int r;                                  // radius of circle
    public:
        Circle(int xx,int yy,int rr)            // constructor
            { x=xx; y=yy; r=rr; }
        void draw();                            // draw circle with center (x,y) and radius r
        void move(int dx, int dy);              // move circle by amount dx,dy
    };

There are a couple of things we'd like to be able to do with these related classes. First, it would be
useful to have an abstract data type called Picture that would be a collection of Lines, Triangles,
Rectangles, and Circles. Second, we'd like to be able to draw() and move() our Pictures.

It would be most elegant if class Picture were general, and contained no mention of the specific
shapes. That way, we could introduce a new shape, say a Pentagon, and not have to change class
Picture in any way.

We can do this by defining a base class Shape with derived classes Line, Triangle, and so on, as
shown in Figure 2-6.


Figure 2-6: Organization of Classes for a Graphics Package



Class Shape declares functions applicable to any kind of shape, such as draw() and move(), as virtual
functions, and implements these functions to write out an error message if called:

    class Shape {
    public:
        virtual void draw();                    // Shape::draw() prints error message
        virtual void move(int dx, int dy);      // Shape::move() prints error message
    };

We change the declarations of classes Line, Triangle, and so on to be derived from class Shape by
adding the name of the base to the declaration of the derived class; for example:

    class Line : public Shape { ...

    class Circle : public Shape { ...

and we also add the keyword virtual to the declarations of the functions draw() and move() in the
derived classes. We don't have to change the implementation of these functions, however.

Now we can write class Picture to deal only with Shapes. We can represent a Picture by an array
containing pointers to its component Shapes, and we can implement Picture::draw(), for example,
simply by calling Shape::draw() for each shape in the picture:

    const int PICTURE_CAPACITY = 100;       // max number of shapes in picture

    class Picture {
        Shape* s[PICTURE_CAPACITY];         // array of pointers to shapes
        int n;                              // current number of shapes in picture
    public:
        Picture() { n = 0; }                // constructor
        void add(const Shape&);             // add shape to picture
        void draw();                        // draw picture
        void move(int dx, int dy);          // move picture
    };

    void Picture::add(const Shape& t)       // add a shape to a picture
    {
        if (n == PICTURE_CAPACITY) {
            cerr << "Picture capacity exceeded\n";
            exit(1);
        }
        s[n++] = &t;                        // add pointer to shape to picture
    }

    void Picture::draw()                    // draw a picture
    {
        int i;
        for (i=0; i<n; i++) s[i]->draw();
    }



Since Shape::draw() is a virtual function, C++ takes care of figuring out the specific class of each
component Shape when the program is executed and calling the appropriate implementation of draw() for
that class. This is called dynamic binding.

If we mistakenly forget to implement draw() for a derived class of Shape, it will inherit the
implementation of draw() from class Shape. When we try to draw that shape, Shape::draw() will be executed,
which issues an error message, as you'll recall.
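
Both behaviors, dynamic binding and the inherited error-reporting default, can be seen in a short
sketch. It assumes present-day standard C++, and as a simplification of our own, draw() returns a
description instead of drawing or printing; the function drawAll() is hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Shape {
public:
    virtual ~Shape() {}
    // Default implementation reports an error, in the spirit of Shape::draw() in the text.
    virtual std::string draw() const { return "error: draw() not implemented"; }
};

class Line : public Shape {
public:
    std::string draw() const { return "line"; }
};

class Circle : public Shape {
public:
    std::string draw() const { return "circle"; }
};

// This class forgets to implement draw(), so it inherits Shape::draw().
class Pentagon : public Shape {};

// Draws every shape through a Shape pointer; the function actually called
// is chosen at run time from the class of each object.
std::string drawAll(const std::vector<const Shape*>& shapes)
{
    std::string out;
    for (std::size_t i = 0; i < shapes.size(); ++i) {
        if (i) out += ", ";
        out += shapes[i]->draw();
    }
    return out;
}
```

Note that drawAll() mentions only Shape; adding a new derived class requires no change to it.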

Going a step further, we might want to be able to build a more complicated picture out of a number of
simpler pictures. We can do this by thinking of a Picture as just another type of Shape, and making it
another derived class of class Shape, leading to the class structure shown in Figure 2-7.

Figure 2-7: Improved Organization of Classes for a Graphics Package



Class Libraries 


Taking this technique to its extreme, we can define a class named, say, Object and derive every class
from it, either directly or indirectly. In class Object we can declare virtual functions that apply to all
classes: functions for copying, printing, storing, reading, and comparing objects, for example. We
then can define general data structures comprised of Objects and functions that operate on them that
will be useful for all classes, just as class Picture could work with any derived class of Shape.

The author has written a library of about 40 general-purpose classes, modeled after the basic classes of
the Smalltalk-80 programming language. The library, known as the Object-Oriented Program Support
(OOPS) class library, contains classes such as String, Date, Time, Set (hash tables), Dictionary
(associative arrays), and LinkedList.

Writing C++ programs using a class library such as this is a real delight. The classes are
general-purpose, and most programs of any size will have uses for some of them. They are flexible: if a
particular class doesn't quite do what is needed it's usually a simple matter to derive a class that does.
And the library is extensible. It provides a framework that makes it easy to add your own custom
classes and make them function along with existing ones.

As an example, let's see how the OOPS class library can help us with the graphics package we've been
discussing. The OOPS library has a class Point for representing x-y coordinates. We can use it in
graphics classes such as Line:

    class Line : public Shape {
        Point a,b;                                  // endpoints of the line
    public:
        Line(Point p1, Point p2) { a=p1; b=p2; }    // constructor
        void draw();                                // draw a line from point a to point b
        void move(Point delta);                     // move line by delta
    };


Many of the arithmetic operators are defined by class Point, so we can implement move(), for
example, by writing:

    void Line::move(Point delta)
    {
        a += delta; b += delta;
    }


Our crude implementation of class Picture allocated an array of fixed size to hold the pointers to its
component shapes. We can use the OOPS library class OrderedCltn to make this a variable-length
array. An OrderedCltn is an array of pointers to Objects, so we can use it to hold pointers to
instances of any class derived from Object, just as we used an array of pointers to Shapes to hold
pointers to Lines, Triangles, and so on. To make class Shape a derived class of Object we modify its
declaration:

    class Shape : public Object { ...

Now we can write class Picture as:

    class Picture : public Shape {
        OrderedCltn s;                      // collection of pointers to shapes
    public:
        Picture() {}                        // constructor
        virtual void add(const Shape&);     // add shape to picture
        virtual void draw();                // draw picture
        virtual void move(Point delta);     // move picture
    };


Class OrderedCltn defines member functions such as add(), remove(), size(), first(), and last() to let us
manipulate the pointers in the array. It also overloads the subscript operator [ ] so we can subscript
OrderedCltns like arrays. Using these we can write the functions Picture::add() and Picture::draw() as
follows:

    void Picture::add(const Shape& t)       // add a shape to a picture
    {
        s.add(t);                           // this calls OrderedCltn::add()
    }

    void Picture::draw()                    // draw a picture
    {
        int i;
        for (i=0; i<s.size(); i++)          // s.size() returns # of objects in s
            ((Shape*)s[i])->draw();         // cast address of ith object
                                            // to Shape* and call draw()
    }


Now Pictures can have as many shapes in them as we need; class OrderedCltn manages the required 
storage for us. 


Object I/O 


Let's write a program that uses our graphics classes to create a simple picture composed of two shapes,
a line and a circle:

    main()
    {
        Picture pict;
        pict.add(*new Line(Point(0,0), Point(10,10)));
        pict.add(*new Circle(Point(10,10), 2));
        pict.draw();
    }

The first statement in the body of main() declares an instance of class Picture named pict; the second
statement constructs an instance of Line with endpoints at (0,0) and (10,10) and adds it to pict; and the
third statement constructs an instance of Circle with the center at (10,10) and radius 2 and also adds it
to pict. The result is the data structure shown in Figure 2-8.

Figure 2-8: The data structure of a simple picture. Instances of OOPS library classes are shown as dashed
rectangles.

[Figure not reproduced: a Picture referring to a Line and a Circle; the Circle holds a Point center
with xc = 10 and yc = 10, and int r = 2.]


What if we wanted to save this data structure on a disk file so it could be read in later and used by
another program? The OOPS class library makes this simple. We create an output stream (an instance
of class fstream) named, for example, out, and write the picture to it with the statements:

    #include <iostream.h>                   // files for
    #include <fstream.h>                    // standard C++ stream I/O
    // ...
    fstream out("picturefile",output);      // create "picturefile"
    pict.storeOn(out);

The function storeOn(), which is implemented in class Object, handles the details of finding all of the
objects in the picture data structure and writing them to the output stream in a program-independent,
machine-independent format. The storeOn() function calls the virtual function storer() to actually
write out member variables. The storer() function is declared in Object and is reimplemented by
each derived class to write out its own member variables. This function is already implemented for all
of the OOPS library classes, but we must write one for any classes of our own which we've derived
from class Object. That's easy to do. For example, the storer() function for class Picture looks like:

    void Picture::storer(iostream& strm)
    {
        Shape::storer(strm);                // store members of base class, if any
        s.storeOn(strm);                    // store member of class Picture
    }


To read a picture from a file, we create an input stream, in, (an instance of class fstream) connected to
the file we wish to read, and read the picture from it with the statements:

    #include <iostream.h>                   // include for
    #include <fstream.h>                    // standard C++ stream I/O
    // ...
    fstream in("picturefile",input);        // open "picturefile" read-only
    readFrom(in,"Picture",pict);

The second argument tells readFrom() that we're expecting an instance of Picture to be read, and
to complain if the next object on the input stream is of any other class.

The function readFrom() works somewhat like storeOn(), calling a small "reader" function which we
must write for each of our classes.

We can use OOPS object I/O to store and read an arbitrarily complex data structure containing
instances of both OOPS library classes and our own classes. Since the data structure is converted into
a program-independent, machine-independent format, we can send it through a UNIX pipe to another
process running on the same machine, or over a network to another process running on a different
kind of machine. This capability is particularly useful for spread sheets, forms, documents, drawings,
electronic mail, and so on. The OOPS class library also gives us a framework to use when
implementing object I/O for our own classes. We don't have to spend time designing a storage format or worry
about such issues as what to do with the pointers in a data structure, for example. We can use the
general-purpose mechanism provided by the OOPS class library, and concentrate on our particular
application.


The Current Status of C++


The C++ programming language is currently implemented as a translator, which accepts C++ source 
code as input and produces C source code as output. The C++ translator and run-time support library 
are written in C++, making them easily portable to most UNIX systems. 

AT&T first made the C++ translator available to universities and non-profit organizations in December, 
1984. Release 1.0 became commercially available as an unsupported product in October, 1985. 

The AT&T C++ Language System can run on any UNIX machine capable of running programs up to 
about 500KB in size, and having a robust C compilation system that can handle variable and external 
symbol names of arbitrary length. The C compiler must also allow structure assignments and the use 
of structures as function arguments and return values. 

Training and third-party supported ports of the AT&T C++ Translator can be obtained for various
UNIX systems, VAX VMS, MS-DOS, and others.


The Future of C++


The definition of the C++ programming language is not yet final. When the ANSI C standard is
completed, C++ will undoubtedly be revised to eliminate any unnecessary incompatibilities; for example,
the ANSI C rules for doing floating point arithmetic will be adopted. Historically, C++ has met the
challenge of evolving while remaining compatible with C and earlier versions of C++.

Will the C++ programming language be as successful as its predecessor, or will it become just another
of the countless languages that never achieve widespread use? Well, C++ has a lot going for it:

■ Since C++ is, with a few minor exceptions, a superset of C, it has no fatal deficiencies. It also
possesses those attributes of C that have contributed to C's success: portability, flexibility, and
efficiency.

■ C++ is less error-prone than C. It thoroughly type-checks programs, as is the trend in modern
programming languages, but not at the expense of flexibility or convenience. A programmer
may coerce (cast) types when necessary, and define his or her own implicit type conversions for
convenience.

■ Support for data abstraction and object-oriented programming make C++ a much more powerful
and expressive language than C. Yet the language remains one of manageable size, much
smaller than PL/1 or ADA, for example.

■ C++ programs are compatible with UNIX and with the large number of existing C libraries for
graphics, database management, math, and statistics.

■ There is a large existing community of C programmers who can begin to use C++ immediately,
gradually learning and utilizing its new features.

■ The AT&T C++ Language System is commercially available in source form, is inexpensive, and
is highly portable. It makes the language accessible on almost all popular operating systems.

■ AT&T is developing a portable C++ compiler, which will compile C++ programs more quickly
than the combination of the C++ Translator and C compiler now required.

■ C++ was designed at the AT&T Bell Laboratories Computer Science Research Center in Murray
Hill. They have an impressive track record in producing successful software, such as the UNIX
system and C language.

The main obstacle to the widespread adoption of C++ is that to realize its benefits one must master the
techniques of data abstraction and/or object-oriented programming, techniques that are unfamiliar
to the current generation of programmers. When this educational problem is solved, C++ should
succeed C as the language of choice for a wide range of applications.


Footnotes


1. This paper fits the description in the U.S. Copyright Act of a "United States Government work."
It was written as a part of the author's official duties as a Government employee. This means it
cannot be copyrighted. This paper is freely available to the public for use without a copyright
notice, and there are no restrictions on its use, now or subsequently.

The author's time and the computer facilities required to prepare this paper were provided by
the Computer Systems Laboratory, Division of Computer Research and Technology, National
Institutes of Health.

2. Binary operators such as + are usually not defined as member functions because automatic
conversion of types is not done for the left operand. For example, the expression a + 47 is
equivalent to a.operator+(47). C++ recognizes that the function operator+(const BigInt&) is
defined and that the constructor BigInt(int) can be used to convert the int 47 to a BigInt before
calling operator+. However, the expression 47 + a is equivalent to 47.operator+(a), which is an
error because 47 is not an instance of a class and therefore has no member functions that can be
applied to it. For this reason, binary operators are usually defined as friend functions, which are
discussed later.
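
The point of footnote 2 can be sketched with a hypothetical class Num, which stands in for BigInt and
simply wraps an int. Because its operator+ is a friend function and not a member, both operands are
ordinary arguments and the int-to-Num conversion may be applied to either one:

```cpp
// Hypothetical stand-in for BigInt that stores a single int,
// just enough to show the conversion rule.
class Num {
    int value;
public:
    Num(int v) : value(v) {}        // also serves as the int -> Num conversion
    int get() const { return value; }

    // Friend (non-member) operator: the conversion can apply to the left operand too.
    friend Num operator+(const Num& a, const Num& b);
};

Num operator+(const Num& a, const Num& b)
{
    return Num(a.value + b.value);
}
```

With this declaration both a + 47 and 47 + a compile and give the same result; had operator+ been a
member function, only the first form would be accepted.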
An Overview of C++


NOTE 


This chapter is taken directly from a paper by Bjarne Stroustrup. 


Introduction 


C++ is a general purpose programming language designed to make programming more enjoyable for 
the serious programmer. Except for minor details, C++ is a superset of the C language. C++ was 
designed to 


■ be a better C 

■ support data abstraction 

■ support object-oriented programming 

This paper describes the features added to C to achieve this. In addition to C, the main influences on
the design of C++ were Simula67 and Algol68.

C++ has been in use for about four years and has been applied to most branches of systems
programming including compiler construction, data base management, graphics, image processing, music
synthesis, networking, numerical software, programming environments, robotics, simulation, and
switching. It has a highly portable implementation and there are now thousands of installations including
AT&T 3B, DEC VAX, Intel 80286, Motorola 68000, and Amdahl machines running UNIX and other
operating systems.


What is Good about C? 

C is clearly not the cleanest language ever designed nor the easiest to use; so why do so many people
use it?

■ C is flexible: it is possible to apply C to most every application area, and to use most every
programming technique with C. The language has no inherent limitations that preclude particular
kinds of programs being written.

■ C is efficient: the semantics of C are "low level"; that is, the fundamental concepts of C mirror
the fundamental concepts of a traditional computer. Consequently, it is relatively easy for a
compiler and/or a programmer to utilize hardware resources for a C program efficiently.

■ C is available: given a computer, whether the tiniest micro or the largest super-computer, the
chance is that there is an acceptable quality C compiler available and that that C compiler
supports an acceptably complete and standard C language and library. There are also libraries and
support tools available, so that a programmer rarely needs to design a new system from scratch.

■ C is portable: a C program is not automatically portable from one machine (and operating
system) to another nor is such a port necessarily easy to do. It is, however, usually possible and the
level of difficulty is such that porting even major pieces of software with inherent machine
dependences is typically technically and economically feasible.

Compared with these "first order" advantages, the "second order" drawbacks like the curious C
declarator syntax and the lack of safety of some language constructs become less important. Designing
"a better C" implies compensating for the major problems involved in writing, debugging, and
maintaining C programs without compromising the advantages of C. C++ preserves all these advantages and
compatibility with C at the cost of abandoning claims to perfection and of some compiler and
language complexity. However, designing a language "from scratch" does not ensure perfection and
the C++ compilers compare favorably in run-time, have better error detection and reporting, and equal
the C compilers in code quality.


A Better C 

The first aim of C++ is to be "a better C" by providing better support for the styles of programming 
for which C is most commonly used. This primarily involves providing features that make the most 
common errors unlikely (since C++ is a superset of C such errors cannot simply be made impossible). 

Argument Type Checking and Coercion 

The most common error in C programs is a mismatch between the type of a function argument and
the type of the argument expected by the called function. For example:

    double sqrt(a) double a;
    {
        /* ... */
    }

    double sq2 = sqrt(2);

Since C does not check the type of the argument 2, the call sqrt(2) will typically cause a run time error
or give a wrong result when the square root function tries to use the integer 2 as a double precision
floating point number. In C++, this program will cause no problem since 2 will be converted to a
floating point number at the point of the call. That is, sqrt(2) is equivalent to sqrt((double)2).

Where an argument type does not match the argument type specified in the function declaration and
no type conversion is defined the compiler issues an error message. For example, in C++ sqrt("Hello")
causes a compile time error.

Naturally, the C++ syntax also allows the type of arguments to be specified in function declarations:

    double sqrt(double);

and a matching function definition syntax is also introduced:

    double sqrt(double d)
    {
        // ...
    }

Inline Functions

Most C programs rely on macros to avoid function call overhead for small frequently-called
operations. Unfortunately the semantics of macros are very different from the semantics of functions so the
use of macros has many pitfalls. For example:

    #define mul(a,b) a*b
    int z = mul(x*3+2,y/4);

Here z will be wrong since the macro will expand to x*3+2*y/4. Furthermore, C macro definitions do
not follow the syntactic rules of C declarations, nor do macro names follow the usual C scope rules.
C++ circumvents such problems by allowing the programmer to declare inline functions:

    inline int mul(int a, int b) { return a*b; }

An inline function has the same semantics as a "normal" function but the compiler can typically inline
expand it so that the code-space and run-time efficiency of macros are achieved.
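
The difference can be checked directly. In the sketch below the macro is renamed MUL_MACRO, and the
wrappers viaMacro() and viaInline() are hypothetical helpers added so that both forms can coexist in
one file:

```cpp
// Unparenthesized macro in the style of the text; renamed so it does not clash with mul().
#define MUL_MACRO(a,b) a*b

inline int mul(int a, int b) { return a*b; }

// The macro argument text is pasted in as is, giving x*3+2*y/4.
int viaMacro(int x, int y)  { return MUL_MACRO(x*3+2, y/4); }

// The function evaluates each argument first, giving (x*3+2)*(y/4).
int viaInline(int x, int y) { return mul(x*3+2, y/4); }
```

For x = 1 and y = 8 the macro form computes 3 + 2*8/4 = 7, whereas the inline function computes
5 * 2 = 10, which is the intended product.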

Scoped and Typed Constants 

Since C does not have a concept of a symbolic constant, macros are used. For example:

    #define TBLMAX (TBLSIZE-1)

Such "constant macros" are neither scoped nor typed and can (if not properly parenthesized) cause
problems similar to those of other macros. Furthermore, they must be evaluated each time they are
used and their names are "lost" in the macro expansion phase of the compilation and consequently are
not known to symbolic debuggers and other tools. In C++ constants of any type can be declared:

    const int TBLMAX = TBLSIZE-1;


Varying Numbers of Arguments 

Functions taking varying numbers of arguments and functions accepting arguments of different types
are common in C. They are a notable source of both convenience and errors.

C functions where the type of arguments or the number of arguments (but not both) can vary can be
handled in a simple and type-secure manner in C++. For example, a function taking one, two, or three
arguments of known type can be handled by supplying default argument values which the compiler
uses when the programmer leaves out arguments. For example:

    void print(char*, char* = "-", char* = "-");

    print("one", "two", "three");
    print("one", "two");        // that is, print("one", "two", "-");
    print("one");               // that is, print("one", "-", "-");
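
The same rule can be exercised with a function that returns its result instead of printing it. The
function join() below is a hypothetical example of ours, written in present-day standard C++:

```cpp
#include <string>

// Joins up to three fields with spaces; omitted trailing arguments take the default "-".
std::string join(const std::string& a,
                 const std::string& b = "-",
                 const std::string& c = "-")
{
    return a + " " + b + " " + c;
}
```

A call such as join("one") is then treated by the compiler exactly as join("one", "-", "-").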

Some C functions take arguments of varying types to provide a common name for functions performing
similar operations on objects of different types. This can be handled in C++ by overloading a function
name. That is, the same name can be used for two functions provided the argument types are
sufficiently different to enable the compiler to "pick the right one" for each call. For example:

    void print(int);
    void print(char*);

    print(1);       // integer print function
    print("two");   // string print function
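
Overload selection can be observed directly in a sketch where each overload reports which one was chosen. The name describe() is hypothetical and used only here.

```cpp
#include <string>

// Two functions share one name; the compiler picks one by argument type.
inline std::string describe(int) { return "integer"; }
inline std::string describe(const char*) { return "string"; }
```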

The most general examples of C functions with varying arguments cannot be handled in a type-secure 
manner. Consider the standard output function printf, which takes a format string followed by an
arbitrary collection of arguments supposedly matching the format string (see footnote 1):

    printf("a string");
    printf("x = %d\n",x);
    printf("name: %s\n size: %d\n", obj.name, obj.size);

However, in C++ one can specify the type of initial arguments and leave the number and type of the 
remaining arguments unspecified. For example, printf and its variants can be declared like this: 

    int printf(const char* ...);
    int fprintf(FILE*, const char* ...);
    int sprintf(char*, const char* ...);

These declarations allow the compiler to catch errors such as

    printf(stderr,"x = %d\n",x);   // error: printf does not take a FILE*
    fprintf("x = %d\n",x);         // error: fprintf needs a FILE*


Declarations as Statements 

Uninitialized variables are another common source of errors. One cause of this class of errors is the 
requirement of the C syntax that declarations can occur only at the beginning of a block (before the
first statement). In C++, a declaration is considered a kind of statement and can consequently be
placed anywhere. It is often convenient to place the declaration where it is first needed so that it can
be initialized immediately. For example:

    void some_function(char* p)
    {
        if (p==0) error("p==0 in some_function");
        int length = strlen(p);
        // ...
    }

Support for Data Abstraction


C++ provides support for data abstraction: the programmer can define types that can be used as
conveniently as built-in types and in a similar manner. Arithmetic types such as rational and complex
numbers are common examples:

    class complex {
        double re, im;
    public:
        complex(double r, double i) { re=r; im=i; }
        complex(double r) { re=r; im=0; }   // float->complex conversion

        friend complex operator+(complex, complex);
        friend complex operator-(complex, complex);   // binary minus
        friend complex operator-(complex);            // unary minus
        friend complex operator*(complex, complex);
        friend complex operator/(complex, complex);
        // ...
    };

The declaration of class (that is, user-defined type) complex specifies the representation of a complex
number and the set of operations on a complex number. The representation is private; that is, re and
im are accessible only to the functions defined in the declaration of class complex. Such functions can
be defined like this:

    complex operator+(complex a1, complex a2)
    {
        return complex(a1.re+a2.re, a1.im+a2.im);
    }

and used like this:

    main()
    {
        complex a = 2.3;
        complex b = 1/a;
        complex c = a + b * complex(1,2.3);
        // ...
    }

Functions declared in a class declaration using the keyword friend are called friend functions. They do
not differ from ordinary functions except that they may use private members of classes that name
them friends. A function can be declared as a friend of more than one class. Other functions declared
in a class declaration are called member functions. A member function is in the scope of the class and
must be invoked for a specific object of that class.
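
A compilable sketch of the class, reduced to two operators, may help make the roles of friends and members concrete. The namespace sketch and the member functions real() and imag() are additions made for this illustration so that results can be inspected; they are not part of the class shown in the text.

```cpp
namespace sketch {

class complex {
    double re, im;                                   // private representation
public:
    complex(double r, double i) { re = r; im = i; }
    complex(double r) { re = r; im = 0; }            // double -> complex conversion

    friend complex operator+(complex, complex);
    friend complex operator*(complex, complex);

    double real() const { return re; }               // member functions, added
    double imag() const { return im; }               // for inspection only
};

// Friends are ordinary functions that may use the private members re and im.
inline complex operator+(complex a1, complex a2)
{
    return complex(a1.re + a2.re, a1.im + a2.im);
}

inline complex operator*(complex a1, complex a2)
{
    return complex(a1.re * a2.re - a1.im * a2.im,
                   a1.re * a2.im + a1.im * a2.re);
}

}
```

With these definitions an expression such as sketch::complex(1,2) + sketch::complex(3,4) * 2 is accepted; the 2 is converted by the one-argument constructor.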



Initialization and Cleanup 

When the representation of a type is hidden some mechanism must be provided for a user to initialize
variables of that type. A simple solution is to require a user to call some function to initialize a
variable before using it. This is error prone and inelegant. A better solution is to allow the designer of a
type to provide a distinguished function to do the initialization. Given such a function, allocation and
initialization of a variable becomes a single operation (often called instantiation) instead of two
separate operations. Such an initialization function is called a constructor. In cases where construction
of objects of a type is non-trivial one often needs a complementary operation to clean up objects after
their last use. In C++ such a cleanup function is called a destructor. Consider a vector type:

    class vector {
        int sz;          // number of elements
        int* v;          // pointer to integers
    public:
        vector(int);     // constructor
        ~vector();       // destructor
        // ...
    };

The vector constructor can be defined to allocate a suitable amount of space like this:

    vector::vector(int s)
    {
        if (s<=0) error("bad vector size");
        sz = s;
        v = new int[s];   // allocate an array of "s" integers
    }

The cleanup done by the vector destructor consists of freeing the storage used to store the vector
elements for re-use by the free store manager:

    vector::~vector()
    {
        delete v;   // deallocate the memory pointed to by v
    }



C++ does not support garbage collection. This is, however, compensated for by enabling a type to
maintain its own storage management without requiring intervention from a user. vector is an
example of this.
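
The pairing of constructor and destructor can be made visible with a counter. This is a sketch under stated assumptions: the namespace sketch, the counter live_vectors, and the function count_inside_scope() are hypothetical additions, the size is assumed to be positive, and the array is released with delete[] as current C++ requires.

```cpp
namespace sketch {

int live_vectors = 0;   // bookkeeping for this example only

class vector {
    int sz;     // number of elements
    int* v;     // pointer to integers
public:
    explicit vector(int s) : sz(s), v(new int[s]()) { ++live_vectors; }
    ~vector() { delete[] v; --live_vectors; }   // arrays need delete[] today

    int size() const { return sz; }

private:
    // Copying is disabled in this sketch; copy semantics are a separate topic.
    vector(const vector&);
    vector& operator=(const vector&);
};

// Creates a vector in a local scope and reports how many exist at that point.
inline int count_inside_scope()
{
    vector v(10);
    return live_vectors;   // the destructor runs after the value is read
}

}
```

The user of sketch::vector never calls an initialization or cleanup function explicitly; both happen as part of the object's lifetime.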

Free Store Operators 

The operators new and delete were introduced to provide a standard notation for free store allocation
and deallocation. A user can provide alternatives to their default implementations by defining
functions called operator new and operator delete. For built-in types the new and delete operators
provide only a notational convenience (compared with the standard C functions malloc() and free()). For
user-defined types such as vector the free store operators ensure that constructors and destructors are
called properly:

    vector* fct1(int n)
    {
        vector v(n);                 // allocate a vector on the stack
                                     // the constructor is called
        vector* p = new vector(n);   // allocate a vector on the free store
                                     // the constructor is called
        // ...
        return p;
        // the destructor is implicitly called for "v" here
    }


    void fct2()
    {
        vector* pv = fct1(10);
        // ...
        delete pv;   // call the destructor and free the store
    }


References 

C provides (only) "call by value" semantics for function argument passing; "call by reference" can be
simulated by explicit use of pointers. This is sufficient, and often preferable to using "pass by value"
for the built-in types of C. However, it can be inconvenient for larger objects (see footnote 2) and can
get seriously in the way of defining conventional notation for user-defined types in C++. Consequently,
the concept of a reference is introduced. A reference acts as a name for an object; T& means reference
to T. A reference must be initialized and becomes an alternative name for the object it is initialized
with. For example:

    int a = 1;    // "a" is an integer initialized to "1"
    int& r = a;   // "r" is a reference initialized to "a"

The reference r and the integer a can now be used in the same way and with the same meaning. For
example:

    int b = r;   // "b" is initialized to the value of "r", that is, "1"
    r = 2;       // the value of "r", that is, the value of "a" becomes "2"

References enable variables of types with "large representations" to be manipulated efficiently without
explicit use of pointers. Constant references are particularly useful:

    matrix operator+(const matrix& a, const matrix& b)
    {
        // code here cannot modify the value of "a" or "b"
    }

    matrix a = b+c;

In such cases the "call by value" semantics are preserved while achieving the efficiency of "call by
reference."
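
A small sketch of the three uses described above: aliasing, passing a large object by constant reference, and letting a function modify its argument. The type big and the functions alias_demo(), first(), and set_first() are hypothetical names chosen for this illustration.

```cpp
// A reference is another name for the object it was initialized with.
inline int alias_demo()
{
    int a = 1;
    int& r = a;   // r names a
    r = 2;        // assigns to a
    return a;
}

struct big { int data[256]; };   // stand-in for a type with a large representation

// Passing by const reference avoids a copy and forbids modification.
inline int first(const big& b) { return b.data[0]; }

// A non-const reference lets the callee change the caller's object.
inline void set_first(big& b, int value) { b.data[0] = value; }
```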

Assignment and Initialization

Controlling construction and destruction of objects is sufficient for many, but not all, types. It can also
be necessary to control all copy operations. Consider:

    vector v1(100);   // make v1 a vector of 100 elements
    vector v2 = v1;   // make v2 a copy of v1
    v1 = v2;          // assign v2 to v1 (that is, copy the elements)

Declaring a function with the name operator= in the declaration of class vector specifies that vector
assignment is to be implemented by that function:

    class vector {
        int* v;
        int sz;
    public:
        // ...
        void operator=(vector&);   // assignment
    };

Assignment might be defined like this:

    void vector::operator=(vector& a)   // check size and copy elements
    {
        if (sz != a.sz) error("bad vector size for =");
        for (int i = 0; i<sz; i++) v[i] = a.v[i];
    }

Since the assignment operation relies on the "old value" of the vector assigned to, it cannot be used to
implement initialization of one vector with another. What is needed is a constructor that takes a
vector argument:

    class vector {
        // ...
        vector(int);       // create vector
        vector(vector&);   // create vector and copy elements
    };


    vector::vector(vector& a)   // initialize a vector from another vector
    {
        sz = a.sz;                                  // same size
        v = new int[sz];                            // allocate element array
        for (int i = 0; i<sz; i++) v[i] = a.v[i];   // same values
    }

A constructor like this (of the form X(X&)) is used to handle all initialization. This includes arguments 
passed "by value" and function return values: 

    vector v2 = v1;   // use vector(vector&) constructor to initialize

    void f(vector);
    f(v2);            // use vector(vector&) constructor to pass a copy of v2

    vector g(int sz)
    {
        vector v(sz);
        return v;     // use vector(vector&) constructor to return a copy of v
    }
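
The distinction between initialization and assignment can be exercised in a self-contained sketch. It departs from the text in labeled ways: the class lives in a hypothetical namespace sketch, the parameters are const references, operator= returns a reference as is now conventional, a size mismatch leaves the target unchanged instead of calling error(), and operator[] and size() are added for inspection.

```cpp
namespace sketch {

class vector {
    int sz;
    int* v;
public:
    explicit vector(int s) : sz(s), v(new int[s]()) {}

    // Initialization: there is no old value, so storage is allocated first.
    vector(const vector& a) : sz(a.sz), v(new int[a.sz])
    {
        for (int i = 0; i < sz; i++) v[i] = a.v[i];
    }

    // Assignment: the target already owns storage of its own.
    vector& operator=(const vector& a)
    {
        if (sz == a.sz)   // on a mismatch this sketch does nothing
            for (int i = 0; i < sz; i++) v[i] = a.v[i];
        return *this;
    }

    ~vector() { delete[] v; }

    int& operator[](int i) { return v[i]; }
    int size() const { return sz; }
};

}
```

Because each object owns its own element array, changing a copy leaves the original untouched.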

Operator Overloading 

As demonstrated above, standard operators like +,-,*,/ can be defined for user-defined types, as can 
assignment and initialization in its various guises. In general, all the standard operators with the


exception of 


can be overloaded. The subscripting operator [] and the function application operator () have proven
particularly useful. The C "operator assignment" operators, such as += and *=, have also found many
uses.


It is not possible to redefine an operator when applied to built-in data types, to define new operators,
or to redefine the precedence of operators.
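
As an illustration of the subscripting, function application, and "operator assignment" operators on a user-defined type, consider the following sketch. The class table and the meaning given to each operator are hypothetical choices made for this example.

```cpp
class table {
    int cell[4];
public:
    table() { for (int i = 0; i < 4; i++) cell[i] = 0; }

    // Subscripting: returns a reference so that t[i] can be assigned to.
    int& operator[](int i) { return cell[i]; }

    // Function application: here, the sum of all cells multiplied by scale.
    int operator()(int scale) const
    {
        int sum = 0;
        for (int i = 0; i < 4; i++) sum += cell[i] * scale;
        return sum;
    }

    // "Operator assignment": adds n to every cell.
    table& operator+=(int n)
    {
        for (int i = 0; i < 4; i++) cell[i] += n;
        return *this;
    }
};
```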

Coercions 

User-defined coercions, like the one from floating point numbers to complex numbers implied by the 
constructor complex(double), have proven unexpectedly useful in C++. Such coercions can be applied 
explicitly or the programmer can rely on the compiler adding them implicitly where necessary and 
unambiguous: 


    complex a = complex(1);
    complex b = 1;            // implicit: 1 -> complex(1)
    a = b+complex(2);
    a = b+2;                  // implicit: 2 -> complex(2)
    a = 2+b;                  // implicit: 2 -> complex(2)


Coercions were introduced into C++ because mixed mode arithmetic is the norm in languages used for 
numerical work and because most user-defined types used for "calculation" (for example, matrices, 
character strings, and machine addresses) have natural mappings to and/or from other types. 

Great care is taken (by the compiler) to apply user-defined conversions only where a unique
conversion exists. Ambiguities caused by conversions are compile time errors.

It is also possible to define a conversion to a type without modifying the declaration of that type. For
example:

    class point {
        float dist;
        float angle;
    public:
        // ...
        operator complex()   // convert point to complex number
        {
            return polar(dist,angle);
        }

        operator double()    // convert point to real number
        {
            if (angle) error("cannot convert point to real: angle!=0");
            return dist;
        }
    };

These conversions could be used like this:

    void some_function(point a)
    {
        complex x = a;      // x = a.operator complex()
        double d = a;       // d = a.operator double()
        complex x3 = a+3;   // x3 = a.operator complex() + complex(3);
        // ...
    }

This is particularly useful for defining conversions to built-in types since there is no declaration for a 
built-in type for the programmer to modify. It is also essential for defining conversions to "standard" 
user-defined types where a change may have (unintentionally) wide ranging ramifications and where 
the average programmer has no access to the declaration. 
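
Conversion operators can be tried out on a smaller pair of types. The classes celsius and kelvin and the function in_kelvin() are hypothetical; kelvin plays the role of a "standard" type whose declaration is left untouched, and the conversion to it is declared entirely inside celsius.

```cpp
class kelvin {
    double k;
public:
    kelvin(double value) : k(value) {}
    double value() const { return k; }
};

class celsius {
    double c;
public:
    celsius(double value) : c(value) {}
    operator kelvin() const { return kelvin(c + 273.15); }   // celsius -> kelvin
    operator double() const { return c; }                    // celsius -> double
};

// Takes a kelvin by value; a celsius argument is converted implicitly.
inline double in_kelvin(kelvin t) { return t.value(); }
```

Passing a celsius to in_kelvin() uses celsius::operator kelvin(); the route through double and then kelvin(double) would need two user-defined conversions and is therefore never considered.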


Support for Object-Oriented Programming 


C++ provides support for object-oriented programming: the programmer can define class hierarchies
and a call of a member function can depend on the actual type of an object (even where the actual
type is unknown at compile time). That is, the mechanism that handles member function calls handles 
the case where it is known at compile time that an object belongs to some class in a hierarchy, but 
exactly which class can only be determined at run time. See examples below. 

Derived Classes 

C++ provides a mechanism for expressing commonality among different types by explicitly defining a 
class to be part of another. This allows re-use of classes without modification of existing classes and 
without replication of code. For example, given a class vector: 

    class vector {
        // ...
    public:
        // ...
        vector(int);
        int& operator[](int);   // overload the subscripting operator: []
    };

one might define a vector for which a user can define the index bounds:

    class vec : public vector {
        int low, high;
    public:
        vec(int, int);
        int& operator[](int);
    };

Defining vec as

    : public vector

means that first of all a vec is a vector. That is, type vec has ("inherits") all the properties of type
vector in addition to the ones declared specifically for it. Class vector is said to be the base class for vec,
and conversely vec is said to be derived from vector.

Class vec modifies class vector by providing a different constructor, requiring the user to specify the
two index bounds rather than the size, and by providing its own access function operator[](). A vec's
operator[]() is easily expressed in terms of vector's operator[]():

    int& vec::operator[](int i)
    {
        return vector::operator[](i-low);
    }

The scope resolution operator :: is used to avoid getting caught in an infinite recursion by calling
vec::operator[]() from itself. Note that vec::operator[]() had to use a function like vector::operator[]() to
access elements. It could not just use vector's members v and sz directly since they were declared
private and therefore accessible only to vector's member functions.

The constructor for vec can be written like this: 

    vec::vec(int lb, int hb) : vector(hb-lb+1)
    {
        if (hb-lb<0) hb = lb;
        low = lb;
        high = hb;
    }

The construct :vector(hb-lb+1) is used to specify the argument list needed for the base class
constructor vector().

Class vec can be used like this:

    void some_function(int l, int h)
    {
        vec v1(l,h);
        const int sz = h-l+1;
        vector v2(sz);
        // ...
        for (int i=0; i<sz; i++) v2[i] = v1[l+i];   // copy elements explicitly
        v2 = v1;   // copy elements by using vector::operator=()
    }
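
The two classes can be put together in a compilable sketch. It assumes hb >= lb, performs no range checking, disables copying for brevity, and places both classes in a hypothetical namespace sketch; size() is added so that the result can be inspected.

```cpp
namespace sketch {

class vector {
    int sz;
    int* v;
public:
    explicit vector(int s) : sz(s), v(new int[s]()) {}
    ~vector() { delete[] v; }
    int& operator[](int i) { return v[i]; }   // zero-based access
    int size() const { return sz; }
private:
    vector(const vector&);                    // copying omitted from this sketch
    vector& operator=(const vector&);
};

class vec : public vector {
    int low, high;
public:
    // The base class constructor receives the element count.
    vec(int lb, int hb) : vector(hb - lb + 1), low(lb), high(hb) {}

    // Access relative to the lower bound, expressed through the base class.
    int& operator[](int i) { return vector::operator[](i - low); }
};

}
```

A vec with bounds 10 and 14 holds five elements; element 10 of the vec is element 0 of its vector part.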


Virtual Functions 

Class derivation (often called subclassing) is a powerful tool in its own right but a facility for run-time 
type resolution is needed to support object-oriented programming. 

Consider defining a type shape for use in a graphics system. The system has to support circles,
triangles, squares, and many other shapes. First specify a class that defines the general properties of all
shapes:


    class shape {
        point center;
        color col;
        // ...
    public:
        point where() { return center; }
        void move(point to) { center = to; draw(); }
        virtual void draw();
        virtual void rotate(int);
        // ...
    };


The functions for which the calling interface can be defined, but where the implementation cannot be 
defined except for a specific shape, have been marked virtual (the Simula67 and C++ term for "to be 
defined later in a class derived from this one"). Given this definition one can write general functions
manipulating shapes:

    void rotate_all(shape* v, int size, int angle)
    // rotate all members of vector "v" of size "size" "angle" degrees
    {
        for (int i=0; i < size; i++) v[i].rotate(angle);
    }


For each shape v[i], the proper rotate function for the actual type of the object will be called. That
"actual type" is not known at compile time.

To define a particular shape we must say that it is a shape (that is, derive it from class shape) and 
specify its particular properties (including the virtual functions): 

    class circle : public shape {
        int radius;
    public:
        void draw() { /* ... */ };
        void rotate(int) {}   // yes, the null function
    };


In many contexts it is important that the C++ virtual function mechanism is very nearly as efficient as 
a "normal" function call. The additional run-time overhead is about 4 memory references (dependent 
on the machine architecture and the compiler) and the memory overhead is one word per object plus 
one word per virtual function per class. 
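
Run-time selection of a virtual function can be demonstrated with a reduced hierarchy. This sketch differs from the text in labeled ways: rotate_all() takes an array of pointers to shape so that objects of different derived classes can be mixed, rotate() returns the angle actually applied so the choice can be observed, and the class square and the member name() are hypothetical additions.

```cpp
#include <string>

class shape {
public:
    virtual ~shape() {}
    virtual std::string name() const = 0;
    virtual int rotate(int angle) { return angle; }   // returns the angle applied
};

class circle : public shape {
public:
    std::string name() const { return "circle"; }
    int rotate(int) { return 0; }                     // rotating a circle changes nothing
};

class square : public shape {
public:
    std::string name() const { return "square"; }     // inherits shape::rotate
};

// Calls rotate() through base-class pointers; which function runs depends on
// the run-time type of each object. Returns the sum of the angles applied.
inline int rotate_all(shape* v[], int size, int angle)
{
    int total = 0;
    for (int i = 0; i < size; i++) total += v[i]->rotate(angle);
    return total;
}
```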

Visibility Control 

The basic scheme for separating the (public) user interface from the (private) implementation details 
has worked out very well for data abstraction uses of C++. It matches the idea that a type is a black
box. It has proven to be less than ideal for object-oriented uses.

The problem is that a class defined to be part of a class hierarchy is not simply a black box. It is often
primarily a building block for the design of other classes. In this case the simple binary choice
public/private can be constraining. A third alternative is needed: a member should be private as far as
functions outside the class hierarchy are concerned but accessible to member functions of a derived
class in the same way that it is accessible to members of its own class. Such a member is said to be
protected.

For example, consider a class node for some kind of tree:

    class node {
        // private stuff
    protected:
        node* left;
        node* right;
        // more protected stuff
    public:
        virtual void print();
        // more public stuff
    };


The pointers left and right are inaccessible to the general user but any member function of a class
derived from class node can manipulate the tree without overhead or inconvenience.

The protection/hiding mechanism applies to names independently of whether a name refers to a
function or a data member. This implies that one can have private and protected function members.
Usually it is good policy to keep data private and present the public and protected interfaces as sets of
functions. This policy minimizes the effect of changes to a class on its users and consequently
maximizes its implementor's freedom to make changes.
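
A sketch of a derived class using the protected pointers follows. The class counted_node and its members attach() and count() are hypothetical, and the sketch assumes that every node in the tree is in fact a counted_node, which is what makes the downcast valid.

```cpp
class node {
protected:
    node* left;
    node* right;
public:
    node() : left(0), right(0) {}
    virtual ~node() {}
};

// A derived class may use left and right directly; outside code may not.
class counted_node : public node {
public:
    void attach(counted_node* l, counted_node* r) { left = l; right = r; }

    // Counts this node plus all nodes below it.
    int count() const
    {
        int n = 1;
        if (left) n += static_cast<const counted_node*>(left)->count();
        if (right) n += static_cast<const counted_node*>(right)->count();
        return n;
    }
};
```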

Another refinement of the basic inheritance scheme is that it is possible to inherit public members of a 
base class in such a way that they do not become public members of the derived class. This can be 
used to provide restricted interfaces to standard classes. For example: 

    class dequeue {
        // ...
        void insert(elem*);
        void append(elem*);
        elem* remove();
    };

Given a dequeue a stack can be defined as a derived class where only the insert() and remove()
operations are defined:

    class stack : private dequeue {   // note: not ": public"; members
                                      // of dequeue are private members of stack
    public:
        dequeue::insert;   // make "insert" a public member of stack
        dequeue::remove;   // make "remove" a public member of stack
    };

Alternatively, inline functions can be defined to give these operations the conventional names:

    class stack : private dequeue {
    public:
        void push(elem* ee) { dequeue::insert(ee); }
        elem* pop() { return dequeue::remove(); }
    };
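
Both techniques can be combined in one compilable sketch. Everything about the representation is an assumption made for illustration: elem is taken to be int, the dequeue has a fixed capacity of 16 with no bounds checks, and the classes sit in a hypothetical namespace sketch. The access declaration dequeue::insert; is written as a using-declaration, which is the spelling accepted by current C++ compilers.

```cpp
namespace sketch {

typedef int elem;   // hypothetical element type; the text leaves elem unspecified

class dequeue {
    elem* slot[16];   // fixed capacity, no bounds checks, for brevity
    int n;
public:
    dequeue() : n(0) {}
    void insert(elem* e)                      // add at the front
    {
        for (int i = n; i > 0; i--) slot[i] = slot[i - 1];
        slot[0] = e;
        n++;
    }
    void append(elem* e) { slot[n++] = e; }   // add at the back
    elem* remove()                            // take from the front
    {
        elem* e = slot[0];
        n--;
        for (int i = 0; i < n; i++) slot[i] = slot[i + 1];
        return e;
    }
};

class stack : private dequeue {
public:
    using dequeue::insert;                    // re-export under its original name
    void push(elem* e) { dequeue::insert(e); }
    elem* pop() { return dequeue::remove(); }
};

}
```

A user of sketch::stack can call insert(), push(), and pop(), but append() is not accessible through a stack.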


What is Missing? 


C++ was designed under severe constraints of compatibility, internal consistency, and efficiency: no
feature was included that

■ would cause a serious incompatibility with C at the source or linker levels

■ would cause run-time or space overheads for a program that did not use it

■ would increase run-time or space requirements for a C program

■ would significantly increase the compile time compared with C

■ could only be implemented by making requirements of the programming environment (linker,
loader, etc.) that could not be simply and efficiently implemented in a traditional C programming
environment


Features that might have been provided but weren't because of these criteria include garbage
collection, parameterized classes, exceptions, support for concurrency, and integration of the language with
a programming environment. Not all of these possible extensions would actually be appropriate for
C++ and unless great constraint is exercised when selecting and designing features for a language a
large, unwieldy, and inefficient mess will result. The severe constraints on the design of C++ have
probably been beneficial and will continue to guide the evolution of C++.

Conclusions

C++ has succeeded in providing greatly improved support for traditional C-style programming
without added overhead. In addition, C++ provides sufficient language support for data abstraction
and object-oriented programming in demanding (both in terms of machine utilization and application
complexity) real-life applications. C++ continues to evolve to meet demands of new application areas.
There still appears to be ample scope for improvement even given the (self imposed) Draconian criteria
for compatibility, consistency, and efficiency. However, currently the most active areas of
development are not the language itself but libraries and support tools in the programming environment.

Footnotes


1. A C++ I/O system that avoids the type insecurity of the printf approach is described in The C++
Programming Language.

2. As indicated by an inconsistency in the C semantics, arrays are always passed by reference.

What is “Object-Oriented Programming”?


NOTE


This chapter is taken directly from a paper by Bjarne Stroustrup.


Abstract


"Object-Oriented Programming" and "Data Abstraction" have become very common terms.
Unfortunately, few people agree on what they mean. I will offer informal definitions that appear to make
sense in the context of languages like Ada, C++, Modula-2, Simula, and Smalltalk. The general idea is
to equate "support for data abstraction" with the ability to define and use new types and equate
"support for object-oriented programming" with the ability to express type hierarchies. Features necessary
to support these programming styles in a general purpose programming language will be discussed.
The presentation centers around C++ but is not limited to facilities provided by that language.


Introduction 


Not all programming languages can be "object oriented." Yet claims have been made to the effect that 
APL, Ada, Clu, C++, LOOPS, and Smalltalk are object-oriented programming languages. I have heard
discussions of object-oriented design in C, Pascal, Modula-2, and CHILL. Could there somewhere be
proponents of object-oriented Fortran and Cobol programming? I think there must be.
"Object-oriented" has in many circles become a high-tech synonym for "good," and when you examine
discussions in the trade press, you can find arguments that appear to boil down to syllogisms like:


Ada is good
Object oriented is good
-----------------------
Ada is object oriented


This paper presents one view of what "object oriented" ought to mean in the context of a general
purpose programming language.

■ distinguishes "object-oriented programming" and "data abstraction" from each other and from
other styles of programming and presents the mechanisms that are essential for supporting the
various styles of programming

■ presents features needed to make data abstraction effective

■ discusses facilities needed to support object-oriented programming

■ presents some limits imposed on data abstraction and object-oriented programming by
traditional hardware architectures and operating systems

Examples will be presented in C++. The reason for this is partly to introduce C++ and partly because 
C++ is one of the few languages that supports both data abstraction and object-oriented programming 
in addition to traditional programming techniques. Issues of concurrency and of hardware support for 
specific higher-level language constructs are ignored in this paper. 

Programming Paradigms


Object-oriented programming is a technique for programming, a paradigm for writing "good"
programs for a set of problems. If the term "object-oriented programming language" means anything it
must mean a programming language that provides mechanisms that support the object-oriented style
of programming well.

There is an important distinction here. A language is said to support a style of programming if it
provides facilities that make it convenient (reasonably easy, safe, and efficient) to use that style. A
language does not support a technique if it takes exceptional effort or exceptional skill to write such
programs; it merely enables the technique to be used. For example, you can write structured programs
in Fortran, write type-secure programs in C, and use data abstraction in Modula-2, but it is
unnecessarily hard to do because these languages do not support those techniques.

Support for a paradigm comes not only in the obvious form of language facilities that allow direct use
of the paradigm, but also in the more subtle form of compile-time and/or run-time checks against
unintentional deviation from the paradigm. Type checking is the most obvious example of this;
ambiguity detection and run-time checks can be used to extend linguistic support for paradigms.
Extra-linguistic facilities such as standard libraries and programming environments can also provide
significant support for paradigms.

A language is not necessarily better than another because it possesses a feature the other does not.
There are many examples to the contrary. The important issue is not so much what features a
language possesses but that the features it does possess are sufficient to support the desired
programming styles in the desired application areas:

■ all features must be cleanly and elegantly integrated into the language

■ it must be possible to use features in combination to achieve solutions that would otherwise
have required extra separate features

■ there should be as few spurious and "special purpose" features as possible

■ a feature should be such that its implementation does not impose significant overheads on
programs that do not require it

■ a user need only know about the subset of the language explicitly used to write a program

The last two principles can be summarized as "what you don't know won't hurt you." If there are
any doubts about the usefulness of a feature it is better left out. It is much easier to add a feature to a
language than to remove or modify one that has found its way into the compilers or the literature.

I will now present some programming styles and the key language mechanisms necessary for
supporting them. The presentation of language features is not intended to be exhaustive.

Procedural Programming 

The original (and probably still the most commonly used) programming paradigm is: 


Decide which procedures you want; 
use the best algorithms you can find. 

The focus is on the design of the processing, the algorithm needed to perform the desired
computation. Languages support this paradigm by facilities for passing arguments to functions and returning
values from functions. The literature related to this way of thinking is filled with discussion of ways
of passing arguments, ways of distinguishing different kinds of arguments, different kinds of functions
(procedures, routines, macros, ...), etc. Fortran is the original procedural language; Algol60, Algol68,
C, and Pascal are later inventions in the same tradition.

A typical example of "good style" is a square root function. It neatly produces a result given an
argument. To do this, it performs a well understood mathematical computation:

    double sqrt(double arg)
    {
        // the code for calculating a square root
    }


    void some_function()
    {
        double root2 = sqrt(2);
        // ...
    }


From a program organization point of view, functions are used to create order in a maze of
algorithms.

Data Hiding 

Over the years, the emphasis in the design of programs has shifted away from the design of
procedures towards the organization of data. Among other things, this reflects an increase in the program
size. A set of related procedures with the data they manipulate is often called a module. The
programming paradigm becomes:


Decide which modules you want;
partition the program so that data is hidden in modules.


This paradigm is also known as the "data hiding principle." Where there is no grouping of
procedures with related data the procedural programming style suffices. In particular, the techniques for
designing "good procedures" are now applied for each procedure in a module. The most common
example is a definition of a stack module. The main problems that have to be solved for a good
solution are:

■ provide a user interface for the stack (for example, functions push() and pop())

■ ensure that the representation of the stack (for example, a vector of elements) can only be
accessed through this user interface

■ ensure that the stack is initialized before its first use

Here is a plausible external interface for a stack module:

    // declaration of the interface of module stack of characters
    char pop();
    void push(char);
    const stack_size = 100;


Assuming that this interface is found in a file called stack.h, the "internals" can be defined like this:

    #include "stack.h"

    static char v[stack_size];   // "static" means local to this file/module
    static char* p = v;          // the stack is initially empty

    char pop()
    {
        // check for underflow and pop
    }


    void push(char c)
    {
        // check for overflow and push
    }


It would be quite feasible to change the representation of this stack to a linked list. A user does not
have access to the representation anyway (since v and p were declared static, that is local to the
file/module in which they were declared). Such a stack can be used like this:

    #include "stack.h"

    void some_function()
    {
        push('c');
        char c = pop();
        if (c != 'c') error("impossible");
    }
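
The module can be filled in as a single-file sketch. The bodies of push() and pop() are assumptions, since the text leaves them as comments: this version silently ignores a push onto a full stack and returns 0 from a pop on an empty one, and it writes const int because current compilers no longer accept a declaration without a type.

```cpp
// Interface of the "module" (what stack.h would declare).
char pop();
void push(char);
const int stack_size = 100;

// Representation, hidden from other files by "static".
static char v[stack_size];
static char* p = v;            // next free slot; the stack is empty when p == v

// Assumed policy for this sketch: a push onto a full stack is ignored.
void push(char c)
{
    if (p < v + stack_size) *p++ = c;
}

// Assumed policy for this sketch: a pop from an empty stack returns 0.
char pop()
{
    if (p == v) return 0;
    return *--p;
}
```

Code in another file could call push() and pop() but could not name v or p, so the representation can change without affecting users.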


Pascal (as originally defined) doesn't provide any satisfactory facilities for such grouping: the only
mechanism for hiding a name from "the rest of the program" is to make it local to a procedure. This
leads to strange procedure nestings and over-reliance on global data.

C fares somewhat better. As shown in the example above, you can define a "module" by grouping
related function and data definitions together in a single source file. The programmer can then control
which names are seen by the rest of the program (a name can be seen by the rest of the program unless
it has been declared static). Consequently, in C you can achieve a degree of modularity. However,
there is no generally accepted paradigm for using this facility and the technique of relying on static
declarations is rather low level.

One of Pascal's successors, Modula-2, goes a bit further. It formalizes the concept of a module, making 
it a fundamental language construct with well defined module declarations, explicit control of the 
scopes of names (import/export), a module initialization mechanism, and a set of generally known and 
accepted styles of usage.

The differences between C and Modula-2 in this area can be summarized by saying that C only enables
the decomposition of a program into modules, while Modula-2 supports that technique.

Data Abstraction 

Programming with modules leads to the centralization of all data of a type under the control of a type 
manager module. If one wanted two stacks, one would define a stack manager module with an inter¬ 
face like this: 

    class stack_id;                     // stack_id is a type
                                        // no details about stacks or stack_ids are known here

    stack_id create_stack(int size);    // make a stack and return its identifier
    void destroy_stack(stack_id);       // call when stack is no longer needed

    void push(stack_id, char);
    char pop(stack_id);


This is certainly a great improvement over the traditional unstructured mess, but "types" implemented
this way are clearly very different from the built-in types in a language. Each type manager module
must define a separate mechanism for creating "variables" of its type, there is no established norm for
assigning object identifiers, a "variable" of such a type has no name known to the compiler or
programming environment, nor do such "variables" obey the usual scope rules or argument passing rules.

A type created through a module mechanism is in most important aspects different from a built-in 
type and enjoys support inferior to the support provided for built-in types. For example: 

    void f()
    {
        stack_id s1;
        stack_id s2;

        s1 = create_stack(200);
        // Oops: forgot to create s2

        push(s1,'a');
        char c1 = pop(s1);
        if (c1 != 'a') error("impossible");

        push(s2,'a');
        char c2 = pop(s2);
        if (c2 != 'a') error("impossible");

        destroy_stack(s2);
        // Oops: forgot to destroy s1
    }

In other words, the module concept that supports the data hiding paradigm enables this style of
programming, but it does not support it.

Languages such as Ada, Clu, and C++ attack this problem by allowing a user to define types that
behave in (nearly) the same way as built-in types. Such a type is often called an abstract data type.¹ The
programming paradigm becomes:


Decide which types you want; 
provide a full set of operations for each type. 


Where there is no need for more than one object of a type, the data hiding programming style using
modules suffices. Arithmetic types such as rational and complex numbers are common examples of
user defined types:


    class complex {
        double re, im;
    public:
        complex(double r, double i) { re=r; im=i; }
        complex(double r) { re=r; im=0; }   // float->complex conversion

        friend complex operator+(complex, complex);
        friend complex operator-(complex, complex);     // binary minus
        friend complex operator-(complex);              // unary minus
        friend complex operator*(complex, complex);
        friend complex operator/(complex, complex);
        // ...
    };


The declaration of class (that is, user defined type) complex specifies the representation of a complex 
number and the set of operations on a complex number. The representation is private; that is, re and
im are accessible only to the functions specified in the declaration of class complex. Such functions 
can be defined like this: 

    complex operator+(complex a1, complex a2)
    {
        return complex(a1.re+a2.re, a1.im+a2.im);
    }


and used like this: 

    complex a = 2.3;
    complex b = 1/a;
    complex c = a+b*complex(1,2.3);
    // ...
    c = -(a/b)+2;
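To make the example concrete, here is one way the complete type might look. It is a sketch under stated assumptions: the accessors real() and imag() are hypothetical additions for inspecting results, and the bodies of the operators other than + are ordinary complex arithmetic that the text does not spell out.

```cpp
#include <cassert>

class complex {
    double re, im;
public:
    complex(double r, double i) { re = r; im = i; }
    complex(double r) { re = r; im = 0; }         // double -> complex conversion

    double real() const { return re; }            // hypothetical accessors,
    double imag() const { return im; }            // added for inspection only

    friend complex operator+(complex, complex);
    friend complex operator-(complex, complex);   // binary minus
    friend complex operator-(complex);            // unary minus
    friend complex operator*(complex, complex);
    friend complex operator/(complex, complex);
};

complex operator+(complex a, complex b) { return complex(a.re + b.re, a.im + b.im); }
complex operator-(complex a, complex b) { return complex(a.re - b.re, a.im - b.im); }
complex operator-(complex a) { return complex(-a.re, -a.im); }

complex operator*(complex a, complex b)
{
    return complex(a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re);
}

complex operator/(complex a, complex b)
{
    double d = b.re * b.re + b.im * b.im;         // |b|^2; no check for d == 0
    return complex((a.re * b.re + a.im * b.im) / d,
                   (a.im * b.re - a.re * b.im) / d);
}
```

Since the operators are friends rather than members, both operands are ordinary arguments, so an expression such as 1/a converts the 1 through complex(double) before the call.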


Most, but not all, modules are better expressed as user defined types. For concepts where the 
"module representation" is desirable even when a proper facility for defining types is available, the
programmer can declare a type and only a single object of that type. Alternatively, a language might 
provide a module concept in addition to and distinct from the class concept.

Problems with Data Abstraction

An abstract data type defines a sort of black box. Once it has been defined, it does not really interact
with the rest of the program. There is no way of adapting it to new uses except by modifying its
definition. This can lead to severe inflexibility. Consider defining a type shape for use in a graphics
system. Assume for the moment that the system has to support circles, triangles, and squares.

Assume also that you have some classes: 

    class point { /* ... */ };
    class color { /* ... */ };

You might define a shape like this: 

    enum kind { circle, triangle, square };

    class shape {
        point center;
        color col;
        kind k;
        // representation of shape
    public:
        point where() { return center; }
        void move(point to) { center = to; draw(); }
        void draw();
        void rotate(int);
        // more operations
    };

The "type field" k is necessary to allow operations such as draw() and rotate() to determine what kind
of shape they are dealing with (in a Pascal-like language, one might use a variant record with tag k).
The function draw() might be defined like this:

    void shape::draw()
    {
        switch (k) {
        case circle:
            // draw a circle
            break;
        case triangle:
            // draw a triangle
            break;
        case square:
            // draw a square
        }
    }

This is a mess. Functions such as draw() must "know about" all the kinds of shapes there are.
Therefore the code for any such function grows each time a new shape is added to the system. If you define
a new shape, every operation on a shape must be examined and (possibly) modified. You are not able 
to add a new shape to a system unless you have access to the source code for every operation. Since 
adding a new shape involves "touching" the code of every important operation on shapes, it requires 
great skill and potentially introduces bugs into the code handling other (older) shapes. The choice of 
representation of particular shapes can get severely cramped by the requirement that (at least some of)
their representation must fit into the typically fixed sized framework presented by the definition of the
general type shape.

Object-Oriented Programming 

The problem is that there is no distinction between the general properties of any shape (a shape has a 
color, it can be drawn, etc.) and the properties of a specific shape (a circle is a shape that has a radius, 
is drawn by a circle-drawing function, etc.). Expressing this distinction and taking advantage of it 
defines object-oriented programming. A language with constructs that allow this distinction to be 
expressed and used supports object-oriented programming. Other languages don't.

The Simula inheritance mechanism provides a solution. First, specify a class that defines the general 
properties of all shapes: 

    class shape {
        point center;
        color col;
        // ...
    public:
        point where() { return center; }
        void move(point to) { center = to; draw(); }
        virtual void draw();
        virtual void rotate(int);
        // ...
    };


The functions for which the calling interface can be defined, but where the implementation cannot be
defined except for a specific shape, have been marked "virtual" (the Simula and C++ term for "may be
re-defined later in a class derived from this one"). Given this definition, we can write general
functions manipulating shapes:

    void rotate_all(shape** v, int size, int angle)
    // rotate all members of vector "v" of size "size" "angle" degrees
    {
        for (int i = 0; i < size; i++) v[i]->rotate(angle);
    }

To define a particular shape, we must say that it is a shape and specify its particular properties 
(including the virtual functions). 

    class circle : public shape {
        int radius;
    public:
        void draw() { /* ... */ };
        void rotate(int) {}     // yes, the null function
    };

In C++, class circle is said to be derived from class shape, and class shape is said to be a base of class 
circle. An alternative terminology calls circle and shape subclass and superclass, respectively.

The programming paradigm is:


Decide which classes you want;
provide a full set of operations for each class;
make commonality explicit by using inheritance.
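The shape example can be turned into a small compilable sketch. Several details are assumptions made for illustration: the virtual functions are declared pure, class square and the members angle and get_radius() are hypothetical, and rotate_all takes an array of pointers to shape so that objects of different derived classes can sit in one array.

```cpp
#include <cassert>

class shape {
public:
    virtual void draw() = 0;          // pure virtual: an assumption of this sketch
    virtual void rotate(int) = 0;
    virtual ~shape() {}
};

class circle : public shape {
    int radius;
public:
    circle(int r) : radius(r) {}
    int get_radius() const { return radius; }   // hypothetical accessor
    void draw() {}                    // stands in for real drawing code
    void rotate(int) {}               // rotating a circle changes nothing
};

class square : public shape {
public:
    int angle;                        // hypothetical state, for inspection only
    square() : angle(0) {}
    void draw() {}
    void rotate(int a) { angle = (angle + a) % 360; }
};

// Rotate every shape in v without knowing its actual type.
void rotate_all(shape** v, int size, int angle)
{
    for (int i = 0; i < size; i++) v[i]->rotate(angle);
}
```

Adding a triangle class later would require no change to rotate_all or to the existing classes.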


Where there is no such commonality, data abstraction suffices. The amount of commonality between
types that can be exploited by using inheritance and virtual functions is the litmus test of the
applicability of object-oriented programming to an application area. In some areas, such as interactive
graphics, there is clearly enormous scope for object-oriented programming. For other areas, such as classical
arithmetic types and computations based on them, there appears to be hardly any scope for more than
data abstraction and the facilities needed for the support of object-oriented programming seem
unnecessary.

Finding commonality among types in a system is not a trivial process. The amount of commonality to 
be exploited is affected by the way the system is designed. When designing a system, commonality 
must be actively sought, both by designing classes specifically as building blocks for other types and 
by examining classes to see if they exhibit similarities that can be exploited in a common base class.


Support for Data Abstraction 


The basic support for programming with data abstraction consists of facilities for defining a set of
operations for a type and for restricting the access to objects of the type to that set of operations. Once
that is done, however, the programmer soon finds that language refinements are needed for
convenient definition and use of the new types. Operator overloading is a good example of this.

Initialization and Cleanup 

When the representation of a type is hidden, some mechanism must be provided for a user to initialize
variables of that type. A simple solution is to require a user to call some function to initialize a
variable before using it. For example:

class vector { 
int sz; 
int* v; 
public: 

void init (int size); // call init to initialize sz and v 

// before the first use of a vector 

// ... 

}; 


vector v; 

// don't use v here 
v.init(10); 

// use v here 

This is error prone and inelegant. A better solution is to allow the designer of a type to provide a
distinguished function to do the initialization. Given such a function, allocation and initialization of a
variable becomes a single operation (often called instantiation) instead of two separate operations.

Such an initialization function is often called a constructor. In cases where construction of objects of a 
type is non-trivial, one often needs a complementary operation to clean up objects after their last use. 
In C++, such a cleanup function is called a destructor. Consider a vector type: 

    class vector {
        int sz;                         // number of elements
        int* v;                         // pointer to integers
    public:
        vector(int);                    // constructor
        ~vector();                      // destructor
        int& operator[](int index);     // subscript operator
    };

The vector constructor can be defined to allocate space like this: 

    vector::vector(int s)
    {
        if (s<=0) error("bad vector size");
        sz = s;
        v = new int[s];     // allocate an array of "s" integers
    }

The vector destructor frees the storage used: 

    vector::~vector()
    {
        delete[] v;         // deallocate the memory pointed to by v
    }

C++ does not support garbage collection. This is compensated for, however, by enabling a type to 
maintain its own storage management without requiring intervention by a user. This is a common use 
for the constructor/destructor mechanism, but many uses of this mechanism are unrelated to storage 
management.
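A compilable sketch of the constructor/destructor pairing follows. The static counter live and the function live_inside_block are hypothetical, added only to make the automatic cleanup observable; std::abort stands in for the text's error function; and copying is not handled here.

```cpp
#include <cassert>
#include <cstdlib>

class vector {
    int sz;      // number of elements
    int* v;      // pointer to the elements
public:
    static int live;                     // hypothetical counter of live vectors
    vector(int s)
    {
        if (s <= 0) std::abort();        // stands in for error("bad vector size")
        sz = s;
        v = new int[s];
        live++;
    }
    ~vector() { delete[] v; live--; }    // array delete matches new int[s]
    int size() const { return sz; }
    int& operator[](int i) { return v[i]; }   // no range check in this sketch
};

int vector::live = 0;

// Returns the number of live vectors seen inside the function body.
int live_inside_block()
{
    vector a(10);
    a[0] = 7;
    return vector::live;                 // a is destroyed on return
}
```

The caller never frees anything: storage is acquired when a is declared and released when it goes out of scope.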

Assignment and Initialization 

Controlling construction and destruction of objects is sufficient for many types, but not for all. It can 
also be necessary to control all copy operations. Consider class vector: 

    vector v1(100);
    vector v2 = v1;     // make a new vector v2 initialized to v1
    v1 = v2;            // assign v2 to v1

It must be possible to define the meaning of the initialization of v2 and the assignment to v1.
Alternatively it should be possible to prohibit such copy operations; preferably both alternatives should be
available. For example:

    class vector {
        int* v;
        int sz;
    public:
        // ...
        void operator=(vector&);    // assignment
        vector(vector&);            // initialization
    };

specifies that user defined operations should be used to interpret vector assignment and initialization. 
Assignment might be defined like this: 

    void vector::operator=(vector& a)   // check size and copy elements
    {
        if (sz != a.sz) error("bad vector size for =");
        for (int i = 0; i<sz; i++) v[i] = a.v[i];
    }

Since the assignment operation relies on the "old value" of the vector being assigned to, the
initialization operation must be different. For example:

    vector::vector(vector& a)   // initialize a vector from another vector
    {
        sz = a.sz;                                  // same size
        v = new int[sz];                            // allocate element array
        for (int i = 0; i<sz; i++) v[i] = a.v[i];   // copy elements
    }

In C++, a constructor of the form X(X&) defines all initialization of objects of type X with another
object of type X. In addition to explicit initialization, constructors of the form X(X&) are used to
handle arguments passed "by value" and function return values.
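The difference between initialization and assignment can be sketched as below. Two choices here are this sketch's own and differ from the text: the parameters are const references, and assignment reallocates when the sizes differ instead of reporting an error. The function doubled is hypothetical and exists only to exercise pass-by-value and return-by-value.

```cpp
#include <cassert>

class vector {
    int sz;
    int* v;
public:
    vector(int s) : sz(s), v(new int[s]) { for (int i = 0; i < sz; i++) v[i] = 0; }
    ~vector() { delete[] v; }

    vector(const vector& a) : sz(a.sz), v(new int[a.sz])   // initialization:
    {                                                      // there is no old value
        for (int i = 0; i < sz; i++) v[i] = a.v[i];
    }

    vector& operator=(const vector& a)                     // assignment:
    {                                                      // old storage must go
        if (this != &a) {
            int* nv = new int[a.sz];     // allocate before releasing old storage
            for (int i = 0; i < a.sz; i++) nv[i] = a.v[i];
            delete[] v;
            v = nv;
            sz = a.sz;
        }
        return *this;
    }

    int size() const { return sz; }
    int& operator[](int i) { return v[i]; }
};

// Pass and return by value: both go through vector(const vector&).
vector doubled(vector a)
{
    for (int i = 0; i < a.size(); i++) a[i] *= 2;
    return a;
}
```

Without the user-defined copy operations, two vectors would end up sharing one element array and both destructors would try to free it.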

In C++ assignment of an object of class X can be prohibited by declaring assignment private:

    class X {
        void operator=(X&);     // only members of X can
        X(X&);                  // copy an X
    public:
        // ...
    };
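The effect of private copy operations can be checked at compile time. This sketch assumes a C++11 compiler for static_assert and the type traits, neither of which existed in the language the text describes; the const qualifiers are also an addition.

```cpp
#include <type_traits>

class X {
    X(const X&);                 // declared private and never defined:
    void operator=(const X&);    // users of X cannot copy or assign
public:
    X() {}
};

// Compile-time checks (C++11): neither operation is accessible to users.
static_assert(!std::is_copy_constructible<X>::value, "X must not be copyable");
static_assert(!std::is_copy_assignable<X>::value, "X must not be assignable");
```

Any attempt outside the class to write X b = a; or b = a; is rejected by the compiler rather than failing at run time.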


Ada does not support constructors, destructors, overloading of assignment, or user defined control of 
argument passing and function return. This severely limits the class of types that can be defined and
forces the programmer back to "data hiding techniques"; that is, the user must design and use type 
manager modules rather than proper types.

Parameterized Types

Why would you want to define a vector of integers anyway? A user typically needs a vector of
elements of some type unknown to the writer of the vector type. Consequently the vector type ought to
be expressed in such a way that it takes the element type as an argument:

    class vector<class T> {     // vector of elements of type T
        T* v;
        int sz;
    public:
        vector(int s)
        {
            if (s <= 0) error("bad vector size");
            v = new T[sz = s];  // allocate an array of "s" "T"s
        }
        T& operator[](int i);
        int size() { return sz; }
        // ...
    };

Vectors of specific types can now be defined and used:

    vector<int> v1(100);        // v1 is a vector of 100 integers
    vector<complex> v2(200);    // v2 is a vector of 200 complex numbers

    v2[i] = complex(v1[x],v1[y]);

Ada, Clu, and ML support parameterized types. Unfortunately, C++ does not; the notation used here
is simply devised for illustration. Where needed, parameterized classes are "faked" using macros.
There need not be any run-time overheads compared with a class where all types involved are
specified directly.
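Later versions of C++ added templates, which express the same idea with a slightly different notation. The following sketch uses that later syntax and assumes a C++11 compiler for "= delete"; struct point and the function demo are hypothetical.

```cpp
#include <cassert>

template<class T> class vector {     // vector of elements of type T
    T* v;
    int sz;
public:
    explicit vector(int s) : v(new T[s]), sz(s) {}
    ~vector() { delete[] v; }
    vector(const vector&) = delete;              // copying left out of this sketch
    vector& operator=(const vector&) = delete;
    T& operator[](int i) { return v[i]; }
    int size() const { return sz; }
};

struct point { double x, y; };       // hypothetical element type

// Fill a vector<point> from a vector<int>; returns the sum of the x coordinates.
double demo()
{
    vector<int> v1(100);
    for (int i = 0; i < v1.size(); i++) v1[i] = i;
    vector<point> v2(200);
    for (int i = 0; i < v2.size(); i++) { v2[i].x = v1[i % 100]; v2[i].y = 0; }
    double sum = 0;
    for (int i = 0; i < v2.size(); i++) sum += v2[i].x;
    return sum;
}
```

Each use of vector with a new element type produces a separate class at compile time, so no run-time dispatch on the element type is involved.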

Typically a parameterized type will have to depend on at least some aspect of a type parameter. For
example, some of the vector operations must assume that assignment is defined for objects of the
parameter type. How can one ensure that? One solution to this problem is to require the designer of
the parameterized class to state the dependency. For example, "T must be a type for which = is
defined." A better solution is not to require this, or to treat a specification of an argument type as a
partial specification. A compiler can detect a "missing operation" if it is applied and give an error message
such as:


    cannot define vector(non_copy)::operator[](non_copy&):
        type non_copy does not have operator=

This technique allows the definition of types where the dependency on attributes of a parameter type
is handled at the level of the individual operation of the type. For example, one might define a vector
with a sort operation. The sort operation might use <, ==, and = on objects of the parameter type. It
would still be possible to define vectors of a type for which < was not defined as long as the vector
sorting operation was not actually invoked.
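C++ templates, added to the language after this text was written, behave in this per-operation way: a member function of a class template is only compiled for a given element type if it is actually used. The sketch below assumes C++11; struct no_less and the function demo are hypothetical.

```cpp
#include <cassert>

template<class T> class vector {
    T* v;
    int sz;
public:
    explicit vector(int s) : v(new T[s]()), sz(s) {}
    ~vector() { delete[] v; }
    vector(const vector&) = delete;
    vector& operator=(const vector&) = delete;
    T& operator[](int i) { return v[i]; }
    int size() const { return sz; }

    void sort()                       // needs < and = on T, but only if called
    {
        for (int i = 1; i < sz; i++)             // insertion sort
            for (int j = i; j > 0 && v[j] < v[j - 1]; j--) {
                T t = v[j]; v[j] = v[j - 1]; v[j - 1] = t;
            }
    }
};

struct no_less { int id; };           // hypothetical type without operator<

int demo()
{
    vector<no_less> a(3);             // fine: sort() is never instantiated
    a[0].id = 42;

    vector<int> b(3);
    b[0] = 3; b[1] = 1; b[2] = 2;
    b.sort();                         // fine: int has < and =
    return a[0].id + b[0] * 100 + b[1] * 10 + b[2];
}
```

Adding a call a.sort() would make the compiler report the missing operator< for no_less, at the point where sort is instantiated.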

A problem with parameterized types is that each instantiation creates an independent type. For
example, the type vector<char> is unrelated to the type vector<complex>. Ideally one would like to be able
to express and utilize the commonality of types generated from the same parameterized type. For
example, both vector<char> and vector<complex> have a size() function that is independent of the
parameter type. [...]

Exception Handling 

As programs grow, and especially when libraries are used extensively, standards for handling errors
(or more generally: "exceptional circumstances") become important. Ada, Algol68, and Clu [...]

Consider again the vector example. What ought to be done when an out of range index value is
passed to the subscript operator? The designer of the vector class should be able to provide a default
behavior for this. For example:

    class vector {
        // ...
        except vector_range {
            // define an exception called vector_range
            // and specify default code for handling it
            error("global: vector range error");
            exit(99);
        }
    }

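Instead of calling an error function, vector::operator[]() can invoke the exception handling code:

    int& vector::operator[](int i)
    {
        if (i<0 || sz<=i) raise vector_range;
        return v[i];
    }

This causes the call stack to be unraveled until an exception handler for vector_range is found; this
handler is then executed.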


An exception handler may be defined for a specific block:

    void f() {
        vector v(10);
        try {               // errors here are handled by the local
                            // exception handler defined below
            // ...
            int i = g();    // g might cause a range error using some vector
            v[i] = 7;       // potential range error
        }
        except {
            vector::vector_range:
                error("f(): vector range error");
                return;
        }
                            // errors here are handled by the global
                            // exception handler defined in vector
        int i = g();        // g might cause a range error using some vector
        v[i] = 7;           // potential range error
    }


There are many ways of defining exceptions and the behavior of exception handlers. The facility
sketched here resembles the ones found in Clu and Modula-2+. This style of exception handling can
be implemented so that code is not executed unless an exception is raised (except possibly for some
initialization code at the start of a program) or portably across most C implementations by using
setjmp() and longjmp().³

Could exceptions, as defined above, be completely "faked" in a language such as C++? Unfortunately, 
no. The snag is that when an exception occurs, the run-time stack must be unraveled up to a point 
where a handler is defined. To do this properly in C++ involves invoking destructors defined in the 
scopes involved. This is not done by a C longjmp() and cannot in general be done by the user.
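C++ later gained an exception mechanism with try, throw, and catch, whose syntax differs from the except/raise notation sketched above. The following illustration uses that later syntax to show the point about destructors: a local object is destroyed while the stack is unraveled. The counter destroyed and the functions f and g are hypothetical, and "= delete" assumes C++11.

```cpp
#include <cassert>

struct vector_range {};                  // exception type; the name follows the text

int destroyed = 0;                       // hypothetical counter, for inspection only

class vector {
    int* v;
    int sz;
public:
    explicit vector(int s) : v(new int[s]()), sz(s) {}
    ~vector() { delete[] v; destroyed++; }
    vector(const vector&) = delete;
    vector& operator=(const vector&) = delete;
    int& operator[](int i)
    {
        if (i < 0 || sz <= i) throw vector_range();
        return v[i];
    }
};

void g(int i)
{
    vector local(10);                    // must be destroyed if operator[] throws
    local[i] = 7;
}

// Returns true if a range error was raised and handled.
bool f(int i)
{
    try {
        g(i);
    } catch (vector_range&) {
        return true;
    }
    return false;
}
```

A longjmp from operator[] back to f would skip the destructor of local and leak its storage, which is the snag the text describes.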


Coercions 

User-defined coercions, such as the one from floating point numbers to complex numbers implied by
the constructor complex(double), have proven unexpectedly useful in C++. Such coercions can be
applied explicitly or the programmer can rely on the compiler to add them implicitly where necessary 
and unambiguous: 


    complex a = complex(1);
    complex b = 1;          // implicit: 1 -> complex(1)
    a = b+complex(2);
    a = b+2;                // implicit: 2 -> complex(2)

Coercions were introduced into C++ because mixed mode arithmetic is the norm in languages for 
numerical work and because most user defined types used for "calculation" (for example, matrices, 
character strings, and machine addresses) have natural mappings to and/or from other types. 

One use of coercions has proven especially useful from a program organization point of view:

    complex a = 2;
    complex b = a+2;    // interpreted as operator+(a,complex(2))
    b = 2+a;            // interpreted as operator+(complex(2),a)

Only one function is needed to interpret "+" operations and the two operands are handled identically 
by the type system. Furthermore, class complex is written without any need to modify the concept of 
integers to enable the smooth and natural integration of the two concepts. This is in contrast to a
"pure object-oriented system" where the operations would be interpreted like this:

    a+2;    // a.operator+(2)
    2+a;    // 2.operator+(a)

making it necessary to modify class integer to make 2+a legal. Modifying existing code should be
avoided as far as possible when adding new facilities to a system. Typically, object-oriented
programming offers superior facilities for adding to a system without modifying existing code. In this case,
however, data abstraction facilities provide a better solution.

Iterators 

It has been claimed that a language supporting data abstraction must provide a way of defining
control structures. In particular, a mechanism that allows a user to define a loop over the elements of
some type containing elements is often needed. This must be achieved without forcing a user to
depend on details of the implementation of the user defined type. Given a sufficiently powerful 
mechanism for defining new types and the ability to overload operators, this can be handled without a 
separate mechanism for defining control structures. 

For a vector, defining an iterator is not necessary since an ordering is available to a user through the 
indices. I'll define one anyway to demonstrate the technique. There are several possible styles of 
iterators. My favorite relies on overloading the function application operator ().

    class vector_iterator {
        vector& v;
        int i;
    public:
        vector_iterator(vector& r) : v(r) { i = 0; }    // a reference member must be initialized
        int operator()() { return i<v.size() ? v.elem(i++) : 0; }
    };


A vector iterator can now be declared and used for a vector like this: 

    vector v(sz);
    vector_iterator next(v);
    int i;
    while (i=next()) print(i);

More than one iterator can be active for a single object at one time, and a type may have several
different iterator types defined for it so that different kinds of iteration may be performed. An iterator is
a rather simple control structure. More general mechanisms can also be defined. For example, the
C++ standard library provides a co-routine class.

For many "container" types, such as vector, one can avoid introducing a separate iterator type by
defining an iteration mechanism as part of the type itself. A vector might be defined to have a
"current element":

    class vector {
        int* v;
        int sz;
        int current;    // assumed to start at -1, before the first element
    public:
        // ...
        int next() { return (++current<sz) ? v[current] : 0; }
        int prev() { return (0<=--current) ? v[current] : 0; }
    };

Then the iteration can be performed like this: 

    vector v(sz);
    int i;
    while (i=v.next()) print(i);

This solution is not as general as the iterator solution, but avoids overhead in the important special 
case where only one kind of iteration is needed and where only one iteration at a time is needed for a 
vector. If necessary, a more general solution can be applied in addition to this simple one. Note that 
the "simple" solution requires more foresight from the designer of the container class than the iterator
solution does. The iterator-type technique can also be used to define iterators that can be bound to 
several different container types thus providing a mechanism for iterating over different container 
types with a single iterator type. 
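The separate-iterator style can be sketched as follows. The vector shown is a minimal stand-in with the size() and elem() members the iterator relies on; the function sum_twice is hypothetical and shows two iterators active on one vector; "= delete" assumes C++11.

```cpp
#include <cassert>

class vector {
    int* v;
    int sz;
public:
    explicit vector(int s) : v(new int[s]()), sz(s) {}
    ~vector() { delete[] v; }
    vector(const vector&) = delete;
    vector& operator=(const vector&) = delete;
    int size() const { return sz; }
    int& elem(int i) { return v[i]; }
};

class vector_iterator {
    vector& v;
    int i;
public:
    vector_iterator(vector& r) : v(r), i(0) {}
    // Next element, or 0 at the end. As in the text, a stored 0 therefore
    // also ends the loop.
    int operator()() { return i < v.size() ? v.elem(i++) : 0; }
};

// Sums the elements twice, using two independent iterators over one vector.
int sum_twice(vector& v)
{
    vector_iterator next(v), again(v);
    int total = 0, x;
    while ((x = next()) != 0) total += x;
    while ((x = again()) != 0) total += x;
    return total;
}
```

Each iterator carries its own position, which is what the "current element" design gives up in exchange for avoiding a second type.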

Implementation Issues 

The support needed for data abstraction is primarily provided in the form of language features
implemented by a compiler. However, parameterized types are best implemented with support from a
linker with some knowledge of the language semantics, and exception handling requires support from
the run-time environment. Both can be implemented to meet the strictest criteria for both compile
time speed and efficiency without compromising generality or programmer convenience.

As the power to define types increases, programs to a larger degree depend on types from libraries
(and not just those described in the language manual). This naturally puts greater demands on
facilities to express what is inserted into or retrieved from a library, facilities for finding out what a library
contains, facilities for determining what parts of a library are actually used by a program, etc.

For a compiled language, facilities for calculating the minimal compilation necessary after a change
become important. It is essential that the linker/loader is capable of bringing a program into memory
for execution without also bringing in large amounts of related, but unused, code. In particular, a
library/linker/loader system that brings the code for every operation on a type into core just because
the programmer used one or two operations on the type is worse than useless.

Support for Object-Oriented Programming


The basic support a programmer needs to write object-oriented programs consists of a class
mechanism with inheritance and a mechanism that allows calls of member functions to depend on the actual
type of an object (in cases where the actual type is unknown at compile time). The design of the
member function calling mechanism is critical. In addition, facilities supporting data abstraction
techniques (as described above) are important because the arguments for data abstraction and for its
refinements to support elegant use of types are equally valid where support for object-oriented
programming is available. The success of both techniques hinges on the design of types and on the ease,
flexibility, and efficiency of such types. Object-oriented programming simply allows user defined
types to be far more flexible and general than the ones designed using only data abstraction
techniques.

Calling Mechanisms 

The key language facility supporting object-oriented programming is the mechanism by which a
member function is invoked for a given object. For example, given a pointer p, how is a call p->f(arg)
handled? There is a range of choices.⁶

In languages such as C++ and Simula, where static type checking is extensively used, the type system
can be employed to select between different calling mechanisms. In C++, two alternatives are
available:

■ A normal function call: the member function to be called is determined at compile time (through
a lookup in the compiler's symbol tables) and called using the standard function call mechanism
with an argument added to identify the object for which the function is called. Where the
"standard function call" is not considered efficient enough, the programmer can declare a function
inline and the compiler will attempt to inline expand its body. In this way, one can achieve the
efficiency of a macro expansion without compromising the standard function semantics. This
optimization is equally valuable as a support for data abstraction.

■ A virtual function call: The function to be called depends on the type of the object for which it is
called. This type cannot in general be determined until run time. Typically, the pointer p will
be of some base class B and the object will be an object of some derived class D (as was the case
with the base class shape and the derived class circle above). The call mechanism must look
into the object and find some information placed there by the compiler to determine which
function f is to be called. Once that function is found, say D::f, it can be called using the mechanism
described above. The name f is at compile time converted into an index into a table of pointers
to functions. This virtual call mechanism can be made essentially as efficient as the "normal
function call" mechanism. In the standard C++ implementation, only five additional memory
references are used.
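The table-of-pointers idea behind a virtual call can be imitated by hand. This is an illustration only, not the layout any particular compiler uses: the struct, the table names, the DRAW index, and the integer return codes are all hypothetical.

```cpp
#include <cassert>

struct shape;
typedef int (*draw_fn)(shape*);          // every slot here has this signature

struct shape {
    const draw_fn* vtbl;                 // the information "placed in the object"
};

// Hypothetical draw functions; they return a code instead of drawing.
int circle_draw(shape*) { return 1; }
int square_draw(shape*) { return 2; }

// One table per class; the name "draw" has been converted to index 0.
const draw_fn circle_vtbl[] = { circle_draw };
const draw_fn square_vtbl[] = { square_draw };
enum { DRAW = 0 };

// What a call p->draw() amounts to: load the table, index it, call.
int call_draw(shape* p) { return p->vtbl[DRAW](p); }
```

The cost over a normal call is a few extra memory references to fetch the table and the function pointer; no names are looked up at run time.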

In languages with weak static type checking a more elaborate mechanism must be employed. What is
done in a language like Smalltalk is to store a list of the names of all member functions (methods) of a
class so that they can be found at run time:

■ A method invocation: First the appropriate table of method names is found by examining the
object pointed to by p. In this table (or set of tables) the string "f" is looked up to see if the
object has an f(). If an f() is found it is called; otherwise some error handling takes place. This
lookup differs from the lookup done at compile time in a statically checked language in that the
method invocation uses a method table for the actual object.

A method invocation is inefficient compared with a virtual function call, but more flexible. Since static
type checking of arguments typically cannot be done for a method invocation, the use of methods
must be supported by dynamic type checking.
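Method invocation by name can be imitated with a table searched at run time. The sketch simplifies in ways the text does not: the table is kept per object rather than per class, and a missing method yields the hypothetical value no_such_method instead of calling an error handler. All names are illustrative.

```cpp
#include <cassert>
#include <map>
#include <string>

struct object;
typedef int (*method)(object&);

struct object {
    std::map<std::string, method> methods;    // per-object method table
};

int takeoff(object&) { return 1; }            // hypothetical method bodies
int drive(object&)   { return 2; }

const int no_such_method = -1;                // stands in for the error handler

// Look the name up at run time; fail at run time if it is missing.
int invoke(object& o, const std::string& name)
{
    std::map<std::string, method>::iterator it = o.methods.find(name);
    if (it == o.methods.end()) return no_such_method;
    return it->second(o);
}
```

Any name can be tried on any object, which is the flexibility the text describes, and the string lookup on every call is the corresponding cost.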

Type Checking 

The shape example showed the power of virtual functions. What, in addition to this, does a method 
invocation mechanism do for you? You can attempt to invoke any method for any object. 

The ability to invoke any method for any object enables the designer of general purpose libraries to 
push the responsibility for handling types onto the user. Naturally this simplifies the design of
libraries. For example: 

    class stack {   // assume class any has a member next
        any* v;
    public:
        void push(any* p)
        {
            p->next = v;
            v = p;
        }
        any* pop()
        {
            if (v == 0) return error_obj;
            any* r = v;
            v = v->next;
            return r;
        }
    };

It becomes the responsibility of the user to avoid type mismatches like this:

    stack<any*> cs;

    cs.push(new Saab900);
    cs.push(new Saab37B);

    plane* p = (plane*)cs.pop();
    p->takeoff();

    p = (plane*)cs.pop();
    p->takeoff();   // Oops! Run time error: a Saab 900 is a car
                    // a car does not have a takeoff method.


An attempt to use a car as a plane will be detected by the message handler and an appropriate error
handler will be called. However, that is only a consolation when the user is also the programmer.
The absence of static type checking makes it difficult to guarantee that errors of this class are not
present in systems delivered to end-users. Naturally, a language designed with methods and without
static types can express this example with fewer keystrokes.

Combinations of parameterized classes and the use of virtual functions can approach the flexibility, 
ease of design, and ease of use of libraries designed with method lookup without relaxing the static 
type checking or incurring measurable run time overheads (in time or space). For example:

    stack<plane*> cs;

    cs.push(new Saab900);   // Compile time error:
                            // type mismatch: car* passed, plane* expected
    cs.push(new Saab37B);

    plane* p = cs.pop();
    p->takeoff();           // fine: a Saab 37B is a plane

    p = cs.pop();
    p->takeoff();


The use of static type checking and virtual function calls leads to a somewhat different style of
programming than does dynamic type checking and method invocation. For example, a Simula or C++
class specifies a fixed interface to a set of objects (of any derived class) whereas a Smalltalk class
specifies an initial set of operations for objects (of any subclass). In other words, a Smalltalk class is a
minimal specification and the user is free to try operations not specified whereas a C++ class is an
exact specification and the user is guaranteed that only operations specified in the class declaration
will be accepted by the compiler.
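The statically checked container can be sketched with the template facility that C++ acquired later. The class bodies, the fixed capacity of 16, and the integer returned by takeoff are assumptions made to keep the example short and testable.

```cpp
#include <cassert>

struct plane { virtual int takeoff() { return 1; } virtual ~plane() {} };
struct car   { virtual ~car() {} };
struct Saab37B : plane { int takeoff() { return 37; } };
struct Saab900 : car {};

template<class T> class stack {       // fixed capacity keeps the sketch short
    T v[16];
    int n;
public:
    stack() : n(0) {}
    void push(T p) { assert(n < 16); v[n++] = p; }
    T pop() { assert(n > 0); return v[--n]; }
};

int demo()
{
    stack<plane*> cs;
    Saab37B jet;
    // Saab900 sedan; cs.push(&sedan);   // would not compile: car* is not plane*
    cs.push(&jet);
    plane* p = cs.pop();               // no cast needed
    return p->takeoff();               // virtual call reaches Saab37B::takeoff
}
```

The mismatch that surfaced at run time in the method-based version is rejected here before the program can be built, while the virtual call still selects the right takeoff for the actual object.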

Inheritance 

Consider a language having some form of method lookup without having an inheritance mechanism. 
Could that language be said to support object-oriented programming? I think not. Clearly, you could
do interesting things with the method table to adapt the objects' behavior to suit conditions. However,
to avoid chaos, there must be some systematic way of associating methods and the data structures
they assume for their object representation. To enable a user of an object to know what kind of
behavior to expect, there would also have to be some standard way of expressing what is common to
the different behaviors the object might adopt. This "systematic and standard way" would be an
inheritance mechanism.

Consider a language having an inheritance mechanism without virtual functions or methods. Could
that language be said to support object-oriented programming? I think not: the shape example does
not have a good solution in such a language. However, such a language would be noticeably more
powerful than a "plain" data abstraction language. This contention is supported by the observation
that many Simula and C++ programs are structured using class hierarchies without virtual functions.
The ability to express commonality (factoring) is an extremely powerful tool. For example, the
problems associated with the need to have a common representation of all shapes could be solved. No
union would be needed. However, in the absence of virtual functions, the programmer would have to
resort to the use of "type fields" to determine actual types of objects, so the problems with the lack of
modularity of the code would remain. 5

This implies that class derivation (subclassing) is an important programming tool in its own right. It
can be used to support object-oriented programming, but it has wider uses. This is particularly true if
one identifies the use of inheritance in object-oriented programming with the idea that a base class
expresses a general concept of which all derived classes are specializations. This idea captures only
part of the expressive power of inheritance, but it is strongly encouraged by languages where every
member function is virtual (or a method). Given suitable controls of what is inherited (see The C++
Programming Language), class derivation can be a powerful tool for creating new types. Given a class,
derivation can be used to add and/or subtract features. The relation of the resulting class to its base
cannot always be completely described in terms of specialization; factoring may be a better term.
Derivation is another tool in the hands of a programmer and there is no foolproof way of predicting
how it is going to be used, and it is too early (even after 20 years of Simula) to tell which uses are
simply mis-uses.

Multiple Inheritance 

When a class A is a base of class B, a B inherits the attributes of an A; that is, a B is an A in addition
to whatever else it might be. Given this explanation it seems obvious that it might be useful to have a
class B inherit from two base classes A1 and A2. This is called multiple inheritance.

A fairly standard example of the use of multiple inheritance would be to provide two library classes
displayed and task for representing objects under the control of a display manager and co-routines
under the control of a scheduler, respectively. A programmer could then create classes such as

    class my_displayed_task : public displayed, public task {
        // my stuff
    };

    class my_task : public task {              // not displayed
        // my stuff
    };

    class my_displayed : public displayed {    // not a task
        // my stuff
    };


Using (only) single inheritance only two of these three choices would be open to the programmer.
This leads to either code replication or loss of flexibility, and typically both. In C++ this example
can be handled as shown above with no significant overheads (in time or space) compared to single
inheritance and without sacrificing static type checking.


Ambiguities are handled at compile time:

    class A { public: f(); ... };
    class B { public: f(); ... };
    class C : public A, public B { ... };

    void g() {
        C* p;
        p->f();    // error: ambiguous
    }


In this, C++ differs from the object-oriented Lisp dialects that support multiple inheritance. In these
Lisp dialects ambiguities are resolved by considering the order of declarations significant, by
considering objects of the same name in different base classes identical, or by combining methods of
the same name in base classes into a more complex method of the highest class.

In C++, one would typically resolve the ambiguity by adding a function:

    class C : public A, public B {
    public:
        f()
        {
            // C's own stuff
            A::f();
            B::f();
        }
    };


In addition to this fairly straightforward concept of independent multiple inheritance there appears to
be a need for a more general mechanism for expressing dependencies between classes in a multiple
inheritance lattice. In C++, the requirement that a sub-object should be shared by all other sub-objects
in a class object is expressed through the mechanism of a virtual base class:

    class W { ... };

    class Bwindow        // window with border
        : public virtual W
        { ... };

    class Mwindow        // window with menu
        : public virtual W
        { ... };

    class BMW            // window with border and menu
        : public Bwindow, public Mwindow
        { ... };

Here the (single) window sub-object is shared by the Bwindow and Mwindow sub-objects of a BMW.
The Lisp dialects provide concepts of method combination to ease programming using such
complicated class hierarchies. C++ does not.
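The sharing can be observed directly. The sketch below is an illustration under stated assumptions: the id member, the contrasting non-virtual classes (W2, B2, M2, BM2), and the two helper functions are invented here and do not come from the paper.

```cpp
#include <cassert>

struct W { int id; W() : id(0) {} };               // the shared window
struct Bwindow : public virtual W {};              // window with border
struct Mwindow : public virtual W {};              // window with menu
struct BMW : public Bwindow, public Mwindow {};    // window with border and menu

// Non-virtual counterparts, for contrast: each path gets its own copy of W2.
struct W2 { int id; W2() : id(0) {} };
struct B2 : public W2 {};
struct M2 : public W2 {};
struct BM2 : public B2, public M2 {};

// With a virtual base, both paths lead to the same W sub-object.
bool shares_one_W(BMW& x) {
    W* via_b = static_cast<Bwindow*>(&x);
    W* via_m = static_cast<Mwindow*>(&x);
    return via_b == via_m;
}

// Without it, the two paths lead to two distinct W2 sub-objects.
bool has_two_W2(BM2& x) {
    W2* via_b = static_cast<B2*>(&x);
    W2* via_m = static_cast<M2*>(&x);
    return via_b != via_m;
}
```

A change made through the Bwindow part of a BMW is visible through its Mwindow part, which is not the case for the non-virtual BM2.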

Encapsulation 

Consider a class member (either a data member or a function member) that needs to be protected from
"unauthorized access." What choices can be reasonable for delimiting the set of functions that may
access that member? The "obvious" answer for a language supporting object-oriented programming is
"all operations defined for this object"; that is, all member functions. A non-obvious implication of
this answer is that there cannot be a complete and final list of all functions that may access the
protected member since one can always add another by deriving a new class from the protected member's
class and defining a member function of that derived class. This approach combines a large degree of
protection from accident (since you do not easily define a new derived class "by accident") with the
flexibility needed for "tool building" using class hierarchies (since you can "grant yourself access" to
protected members by deriving a class).

Unfortunately, the "obvious" answer for a language oriented towards data abstraction is different:
"list the functions that need access in the class declaration." There is nothing special about these
functions. In particular, they need not be member functions. A non-member function with access to
private class members is called a friend in C++. Class complex above was defined using friend
functions. It is sometimes important that a function may be specified as a friend in more than one class.
Having the full list of members and friends available is a great advantage when you are trying to
understand the behavior of a type and especially when you want to modify it.

Here is an example that demonstrates some of the range of choices for encapsulation in C++:

    class B {
        // class members are default private
        int i1;
        void f1();
    protected:
        int i2;
        void f2();
    public:
        int i3;
        void f3();

        friend void g(B*);    // any function can be designated as a friend
    };

Private and protected members are not generally accessible:

    void h(B* p)
    {
        p->f1();    // error: B::f1 is private
        p->f2();    // error: B::f2 is protected
        p->f3();    // fine: B::f3 is public
    }

Protected members, but not private members, are accessible to members of a derived class:

    class D : public B {
    public:
        void g()
        {
            f1();    // error: B::f1 is private
            f2();    // fine: B::f2 is protected, but D is derived from B
            f3();    // fine: B::f3 is public
        }
    };

Friend functions have access to private and protected members just like member functions:

    void g(B* p)
    {
        p->f1();    // fine: B::f1 is private, but g() is a friend of B
        p->f2();    // fine: B::f2 is protected, but g() is a friend of B
        p->f3();    // fine: B::f3 is public
    }
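The accesses marked "fine" above can be collected into one compilable unit. This is only a sketch: the data values, the constructor, and the function sum_accessible are assumptions added so that the result can be checked, and the accesses marked "error" are left out because they would not compile.

```cpp
#include <cassert>

class B {
    int i1;                        // private by default
protected:
    int i2;
public:
    int i3;
    B() : i1(1), i2(2), i3(3) {}
    friend int g(const B&);        // g is a friend: full access
};

class D : public B {
public:
    // A member of a derived class may use protected and public members;
    // naming i1 here would be a compile-time error.
    int sum_accessible() const { return i2 + i3; }
};

// The friend reads private, protected, and public members alike.
int g(const B& b) { return b.i1 + b.i2 + b.i3; }

// An ordinary function sees only the public member.
int h(const B& b) { return b.i3; }
```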

Encapsulation issues increase dramatically in importance with the size of the program and with the
number and geographical dispersion of its users. See The C++ Programming Language for more detailed
discussions of language support for encapsulation.


Implementation Issues

The support needed for object-oriented programming is primarily provided by the run-time system
and by the programming environment. Part of the reason is that object-oriented programming builds
on the language improvements already pushed to their limit to support data abstraction so that
relatively few additions are needed. 6

The use of object-oriented programming blurs the distinction between a programming language and its
environment further. Since more powerful special- and general-purpose user defined types can be
defined, their use pervades user programs. This requires further development of the run-time system,
library facilities, debuggers, performance measuring, monitoring tools, etc. Ideally these are integrated
into a unified programming environment. Smalltalk is the best example of this.


Limits to Perfection 


A major problem with a language defined to exploit the techniques of data hiding, data abstraction, 
and object-oriented programming is that to claim to be a general purpose programming language it 
must 


■ run on traditional machines 

■ coexist with traditional operating systems 

■ compete with traditional programming languages in terms of run time efficiency 

■ cope with every major application area 

This implies that facilities must be available for effective numerical work (floating point arithmetic
without overheads that would make Fortran appear attractive), and that facilities must be available for
access to memory in a way that allows device drivers to be written. It must also be possible to write
calls that conform to the often rather strange standards required for traditional operating system
interfaces. In addition, it should be possible to call functions written in other languages from an
object-oriented programming language and for functions written in the object-oriented programming
language to be called from a program written in another language.

Another implication is that an object-oriented programming language cannot completely rely on 
mechanisms that cannot be efficiently implemented on a traditional architecture and still expect to be 
used as a general purpose language. A very general implementation of method invocation can be a 
liability unless there are alternative ways of requesting a service. 

Similarly, garbage collection can become a performance and portability bottleneck. Most
object-oriented programming languages employ garbage collection to simplify the task of the programmer
and to reduce the complexity of the language and its compiler. However, it ought to be possible to
use garbage collection in non-critical areas while retaining control of storage use in areas where it
matters. As an alternative, it is feasible to have a language without garbage collection and then
provide sufficient expressive power to enable the design of types that maintain their own storage. C++ is
an example of this.

Exception handling and concurrency features are other potential problem areas. Any feature that is
best implemented with help from a linker is likely to become a portability problem.

The alternative to having "low level" features in a language is to handle major application areas using
separate "low level" languages.


Conclusions 


Object-oriented programming is programming using inheritance. Data abstraction is programming
using user defined types. With few exceptions, object-oriented programming can and ought to be a
superset of data abstraction. These techniques need proper support to be effective. Data abstraction
primarily needs support in the form of language features and object-oriented programming needs
further support from a programming environment. To be general purpose, a language supporting
data abstraction or object-oriented programming must enable effective use of traditional hardware.


Footnotes


1. I prefer the term "user defined type": "Those types are not "abstract"; they are as real as int and
float" (Doug McIlroy). An alternative definition of abstract data types would require a
mathematical "abstract" specification of all types (both built-in and user defined). What are
referred to as types in this paper would, given such a specification, be concrete specifications of
such truly abstract entities.

2. However, more advanced mathematics may benefit from the use of inheritance: Fields are
specializations of rings; vector spaces a special case of modules.

3. See the C library manual for your system.

4. This style also relies on the existence of a distinguished value to represent "end of iteration."
Often, in particular for C++ pointer types, 0 can be used.

5. This is the problem with Simula's inspect statement and the reason it does not have a
counterpart in C++.

6. This assumes that an object-oriented language does indeed support data abstraction. However, 
the support for data abstraction is often deficient in such languages. Conversely, languages that 
support data abstraction are typically deficient in their support of object-oriented programming. 


5

Multiple Inheritance


Multiple Inheritance for C++

Abstract
Introduction
Multiple Inheritance
C++ Implementation Strategy
Multiple Base Classes
■ Object Layout
■ Member Function Call
■ Ambiguities
■ Casting
■ Zero Valued Pointers
Virtual Functions
■ Implementation
■ Ambiguities
Multiple Inclusions
■ Multiple Sub-objects
■ Naming
■ Casting
Virtual Base Classes
■ Representation
■ Virtual Functions
Constructors and Destructors
Visibility
Overheads
But is it Simple to Use?
Conclusions
Footnotes


Multiple Inheritance for C++ 


NOTE 



This chapter is taken directly from a paper by Bjarne Stroustrup. 


Abstract 

Multiple inheritance is the ability of a class to have more than one base class (super class). In a
language where multiple inheritance is supported a program can be structured as a set of inheritance
lattices instead of (just) as a set of inheritance trees. This is widely believed to be an important
structuring tool. It is also widely believed that multiple inheritance complicates a programming language
significantly, is hard to implement, and is expensive to run. I will demonstrate that none of these last
three conjectures are true.


Introduction 

This paper describes an implementation of a multiple inheritance mechanism for C++ (described in The
C++ Programming Language). It provides only the most rudimentary explanation of what multiple
inheritance is in general and what it can be used for. The particular variation of the general concept
implemented here is primarily explained in terms of this implementation. 1

First a bit of background on multiple inheritance and C++ implementation technique is presented, then
the multiple inheritance scheme implemented for C++ is introduced in two stages:

■ the basic scheme for multiple inheritance, the basic strategy for ambiguity resolution, and the
way to implement virtual functions

■ handling of classes included more than once in an inheritance lattice; the programmer has the
choice whether a multiply included base class will result in one or more sub-objects being
created

Finally, some of the complexities and overheads introduced by this multiple inheritance scheme are
summarized.


Multiple Inheritance 

Consider writing a simulation of a network of computers. Each node in the network is represented by
an object of class Switch, each user or computer by an object of class Terminal, and each
communication line by an object of class Line. One way to monitor the simulation (or a real network of the same
structure) would be to display the state of objects of various classes on a screen. Each object to be
displayed is represented as an object of class Displayed. Objects of class Displayed are under control
of a display manager that ensures regular update of a screen and/or data base. The classes Terminal
and Switch are derived from a class Task that provides the basic facilities for co-routine style
behavior. Objects of class Task are under control of a task manager (scheduler) that manages the real
processor(s).

Ideally Task and Displayed are classes from a standard library. If you want to display a terminal,
class Terminal must be derived from class Displayed. Class Terminal, however, is already derived
from class Task. In a single inheritance language, such as C++ or Simula67, we have only two ways
of solving this problem: deriving Task from Displayed or deriving Displayed from Task. Neither is
ideal since they both create a dependency between the library versions of two fundamental and
independent concepts. Ideally one would want to be able to choose between saying that a Terminal is
a Task and a Displayed; that a Line is a Displayed but not a Task; and that a Switch is a Task but not
a Displayed.

The ability to express this using a class hierarchy, that is, to derive a class from more than one base
class, is usually referred to as multiple inheritance. Other examples involve the representation of
various kinds of windows in a window system and the representation of various kinds of processors and
compilers for a multi-machine, multi-environment debugger.
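The three choices named for the network simulation can be written down directly once multiple inheritance is available. The following is a sketch only: the member functions schedulable and drawable and the functions run and show are hypothetical stand-ins for the task manager and display manager, which the paper does not specify.

```cpp
#include <cassert>
#include <type_traits>

struct Task      { virtual ~Task() {}      bool schedulable() const { return true; } };
struct Displayed { virtual ~Displayed() {} bool drawable() const { return true; } };

struct Terminal : public Task, public Displayed {};    // a Task and a Displayed
struct Line     : public Displayed {};                 // a Displayed but not a Task
struct Switch   : public Task {};                      // a Task but not a Displayed

// Hypothetical stand-ins for the task manager and the display manager.
int run(Task& t)       { return t.schedulable() ? 1 : 0; }
int show(Displayed& d) { return d.drawable() ? 1 : 0; }

// Neither library class depends on the other, and the type system records
// which concepts each simulation class combines.
static_assert(!std::is_base_of<Task, Displayed>::value, "Displayed is independent of Task");
static_assert(!std::is_base_of<Task, Line>::value, "a Line is not a Task");
static_assert(!std::is_base_of<Displayed, Switch>::value, "a Switch is not a Displayed");
```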

In general, multiple inheritance allows a user to combine independent (and not so independent)
concepts represented as classes into a composite concept represented as a derived class. A common way
of using multiple inheritance is for a designer to provide sets of base classes with the intention that a
user creates new classes by choosing base classes from each of the relevant sets. Thus a programmer
creates new concepts using a recipe like "pick an A and/or a B." In the window example, a user
might specify a new kind of window by selecting a style of window interaction (from the set of
interaction base classes) and a style of appearance (from the set of base classes defining display
options). In the debugger example, a programmer would specify a debugger by choosing a processor
and a compiler.

Given multiple inheritance and N concepts each of which might somehow be combined with one of M
other concepts, we need N+M classes to represent all the combined concepts. Given only single
inheritance, we need to replicate information and provide N+M+N*M classes. Single inheritance handles
cases where N==1 or M==1. The usefulness of multiple inheritance for avoiding replication hinges on
the importance of examples where the values of N and M are both larger than 1. It appears that
examples with N>=2 and M>=2 are not uncommon; the window and debugger examples described
above will typically have both N and M larger than 2.
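To make the counts concrete, the two formulas can be evaluated for sample sizes. The function names and the sample values N=3, M=4 are chosen here for illustration only; the formulas themselves are the ones stated in the text.

```cpp
#include <cassert>

// Class counts as given in the text for N concepts combined with M others.
constexpr int classes_with_multiple_inheritance(int n, int m) { return n + m; }
constexpr int classes_with_single_inheritance(int n, int m)   { return n + m + n * m; }

// For N=3 and M=4: 7 classes versus 19, a difference of N*M = 12.
static_assert(classes_with_multiple_inheritance(3, 4) == 7, "N+M");
static_assert(classes_with_single_inheritance(3, 4) == 19, "N+M+N*M");
```

The difference between the two counts is always N*M, which is why the saving only becomes noticeable once both N and M exceed 1.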


C++ Implementation Strategy 

Before discussing multiple inheritance and its implementation in C++ I will first describe the main
points in the traditional implementation of the C++ single inheritance class concept.

An object of a C++ class is represented by a contiguous region of memory. A pointer to an object of a
class points to the first byte of that region of memory. The compiler turns a call of a member function
into an "ordinary" function call with an "extra" argument; that "extra" argument is a pointer to the
object for which the member function is called.

Consider a simple class A:

    class A {
        int a;
        void f(int i);
    };

An object of class A will look like this:

    ----------
    | int a; |
    ----------

No information is placed in an A except the integer a specified by the user. No information relating to
(non-virtual) member functions is placed in the object.

A call of the member function A::f:

    A* pa;
    pa->f(2);

is transformed by the compiler into an "ordinary function call":

    f__F1A(pa, 2);
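The transformation can be imitated by hand. In this sketch the free function A_f and its parameter name self are illustrative assumptions; the name the translator actually generates is an encoded one, as in the call shown above.

```cpp
#include <cassert>

struct A {
    int a;
    int f(int i) { return a + i; }    // member function: `this` is implicit
};

// Hand-written equivalent of the generated code: an ordinary function whose
// extra first argument plays the role of `this`.
int A_f(A* self, int i) { return self->a + i; }
```

Both forms compute the same result from the same object pointer, and nothing about f needs to be stored in the object itself.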


Objects of derived classes are composed by concatenating the members of the classes involved:

    class A { int a; void f(int); };
    class B : A { int b; void g(int); };
    class C : B { int c; void h(int); };

Again, no "housekeeping" information is added, so an object of class C looks like this:

    ----------
    | int a; |
    | int b; |
    | int c; |
    ----------

The compiler "knows" the position of all members in an object of a derived class exactly as it does for
an object of a simple class and generates the same (optimal) code in both cases.

Implementing virtual functions involves a table of functions. Consider

    class A {
        int a;
        virtual void f(int);
        virtual void g(int);
        virtual void h(int);
    };

    class B : A { int b; void g(int); };
    class C : B { int c; void h(int); };

In this case, a table of virtual functions, the vtbl, contains the appropriate functions for a given
class and a pointer to it is placed in every object. A class C object looks like this:

    ----------
    | int a; |           vtbl:
    | vptr ........>     ----------
    | int b; |           |  A::f  |
    | int c; |           |  B::g  |
    ----------           |  C::h  |
                         ----------


A call to a virtual function is transformed into an indirect call by the compiler. For example:

    C* pc;
    pc->g(2);

becomes something like:

    (*(pc->vptr[1]))(pc,2);

A multiple inheritance mechanism for C++ must preserve the efficiency and the key features of this
implementation scheme.


Multiple Base Classes 

Given two classes 

class A { ... } ; 
class B { ... } ; 

one can design a third using both as base classes: 
class C : A , B { ... }; 

This means that a C is an A and a B. One might equivalently 3 define C like this: 
class C : B , A { ... }; 

Object Layout 

An object of class C can be laid out as a contiguous object like this:

    ----------
    | A part |
    ----------
    | B part |
    ----------
    | C part |
    ----------

Accessing a member of classes A, B or C is handled exactly as before: the compiler knows the location
in the object of each member and generates the appropriate code (without spurious indirections or
other overhead).

Member Function Call 

Calling a member function of A or C is identical to what was done in the single inheritance case.
Calling a member function of B given a C* is slightly more involved:

    C* pc;
    pc->bf(2);    // assume that bf is a member of B
                  // and that C has no member named bf
                  // except the one inherited from B

Naturally, B::bf() expects a B* (to become its this pointer). To provide it, a constant must be added to
pc. This constant, delta(B), is the relative position of the B part of C. This delta is known to the
compiler that transforms the call into:

    bf__F1B((B*)((char*)pc+delta(B)),2);

The overhead is one addition of a constant per call of this kind. During the execution of a member
function of B the function's this pointer points to the B part of C:

    pc ................>  ----------
                          | A part |
    B::bf's this ......>  ----------
                          | B part |
                          ----------
                          | C part |
                          ----------
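The adjustment by delta(B) can be measured on a present-day compiler. This is a sketch under assumptions: the helper names delta_B and call_bf are invented here, the delta is measured rather than assumed because the layout is chosen by the implementation, and pc is assumed to be non-zero.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a; };
struct B { int b; int bf(int i) { return b + i; } };
struct C : A, B { int c; };

// delta(B) for this compiler's layout of C: the distance in bytes from the
// start of the C object to its B part.
std::ptrdiff_t delta_B(C* pc) {
    B* pb = pc;    // the implicit conversion performs the adjustment
    return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(pc);
}

// What the transformed call does, written out by hand: add delta(B) to the
// C pointer and pass the result as the `this` pointer of B::bf.
int call_bf(C* pc, int i) {
    B* self = reinterpret_cast<B*>(reinterpret_cast<char*>(pc) + delta_B(pc));
    return self->bf(i);
}
```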


Note that there is no space penalty involved in using a second base class and that the minimal time
penalty is incurred only once per call.

Ambiguities

Consider potential ambiguities if both A and B have a public member ii:

    class A { int ii; };
    class B { char* ii; };
    class C : A, B { };

In this case C will have two members called ii, A::ii and B::ii. Then

    C* pc;
    pc->ii;        // error: A::ii or B::ii ?

is illegal since it is ambiguous. Such ambiguities can be resolved by explicit qualification:

    pc->A::ii;     // C's A's ii
    pc->B::ii;     // C's B's ii

A similar ambiguity arises if both A and B have a function f():

    class A { void f(); };
    class B { int f(); };
    class C : A, B { };

    C* pc;
    pc->f();       // error: A::f or B::f ?
    pc->A::f();    // C's A's f
    pc->B::f();    // C's B's f

As an alternative to specifying which base class in each call of an f(), one might define an f() for C.
C::f() might call the base class functions. For example:

    class C : A, B {
        int f() { A::f(); return B::f(); }
    };

    C* pc;
    pc->f();       // C::f is called
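A compilable variant of these rules follows. It is a sketch: the return values 1 and 2 and the decision to add them in C::f are assumptions made so that each call can be told apart, and struct is used instead of class so that access control does not get in the way.

```cpp
#include <cassert>

struct A { int ii;         int f() { return 1; } };
struct B { const char* ii; int f() { return 2; } };

struct C : A, B {
    // C's own f() hides both inherited versions and decides how to combine
    // them, so an unqualified call is no longer ambiguous.
    int f() { return A::f() + B::f(); }
};

// Without C::f, `pc->f()` would not compile; `pc->ii` does not compile in
// either case and must be written as pc->A::ii or pc->B::ii.
```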


Casting 

Explicit and implicit casting may also involve modifying a pointer value with a delta:

    C* pc;
    B* pb;

    pb = (B*)pc;    // pb = (B*)((char*)pc+delta(B))
    pb = pc;        // pb = (B*)((char*)pc+delta(B))
    pc = pb;        // error: cast needed
    pc = (C*)pb;    // pc = (C*)((char*)pb-delta(B))

Casting yields the pointer referring to the appropriate part of the same object:

    pc ...>  ----------
             | A part |
    pb ...>  ----------
             | B part |
             ----------
             | C part |
             ----------

Comparisons are interpreted in the same way:

    pc == pb;    // that is, pc == (C*)pb
                 // or equivalently (B*)pc == pb

                 // that is, (B*)((char*)pc+delta(B)) == pb
                 // or equivalently pc == (C*)((char*)pb-delta(B))

Note that in both C and C++ casting has always been an operator that produced one value given
another rather than an operator that simply reinterpreted a bit pattern. For example, on almost all
machines (int).2 causes code to be executed; (float)(int).2 is not equal to .2. Introducing multiple
inheritance as described here will introduce cases where (char*)(B*)v!=(char*)v for some pointer type B*.
Note, however, that when B is a base class of C, (B*)v==(C*)v==v.
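Both observations, that a cast computes a value and that pointers to the same object still compare equal, can be checked with a small sketch. The helper functions are invented for illustration, and the literal 0.2f stands in for the .2 of the text.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };
struct C : A, B { int c; };

// A numeric cast computes a new value; it does not reinterpret bits.
// (int)0.2f is 0, and converting 0 back to float does not give 0.2f.
bool float_round_trip_differs() { return (float)(int)0.2f != 0.2f; }

// Comparing a C* with a B* converts the C* to a B* first, so a pointer to a
// C and a pointer to that C's B part compare equal.
bool same_object(C* pc, B* pb) { return pc == pb; }
```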

Zero Valued Pointers 

Pointers with the value zero cause a separate problem in the context of multiple base classes. Consider
applying the rules presented above to a zero-valued pointer:

    C* pc = 0;
    B* pb = 0;
    if (pb == 0) ...

    pb = pc;    // pb = (B*)((char*)pc+delta(B))
    if (pb == 0) ...

The second test would fail since pb would have the value (B*)((char*)0+delta(B)).

The solution is to elaborate the conversion (casting) operation to test for the pointer-value 0:

    C* pc = 0;
    B* pb = 0;
    if (pb == 0) ...

    pb = pc;    // pb = (pc==0)?0:(B*)((char*)pc+delta(B))
    if (pb == 0) ...

The added complexity and run-time overhead are a test and an increment.
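The elaborated conversion can be written out as an ordinary function. This is a sketch: to_B and its delta parameter are names invented here, nullptr is the modern spelling of the zero pointer, and the caller is assumed to supply the delta for its own compiler's layout.

```cpp
#include <cassert>
#include <cstddef>

struct A { int a; };
struct B { int b; };
struct C : A, B { int c; };

// Hand-written version of the elaborated conversion: test for zero first,
// and only then add delta(B).
B* to_B(C* pc, std::ptrdiff_t delta) {
    return pc == nullptr
        ? nullptr
        : reinterpret_cast<B*>(reinterpret_cast<char*>(pc) + delta);
}
```

The built-in conversion from C* to B* behaves the same way: a zero C* converts to a zero B*.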


Virtual Functions 


Naturally, member functions may be virtual:

    class A { virtual void f(); };
    class B { virtual void f(); virtual void g(); };
    class C : A , B { void f(); };

    A* pa = new C;
    B* pb = new C;
    C* pc = new C;

    pa->f();
    pb->f();
    pc->f();

All these calls will invoke C::f(). This follows directly from the definition of virtual since class C is
derived from class A and from class B.

Implementation 

On entry to C::f, the this pointer must point to the beginning of the C object (and not to the B part).
However, it is not in general known at compile time that the B pointed to by pb is part of a C so the
compiler cannot subtract the constant delta(B). Consequently delta(B) must be stored so that it can be
found at run time. Since it is only used when calling a virtual function the obvious place to store it is
in the table of virtual functions (vtbl). For reasons that will be explained below the delta is stored
with each function in the vtbl so that a vtbl entry will be of the form:

    struct vtbl_entry {
        void (*fct)();
        int delta;
    };

An object of class C will look like this:

    ----------          vtbl:
    | vptr ........>    -----------------------
    | A part |          |  C::f  |  0         |
    ----------          -----------------------
                        vtbl:
    | vptr ........>    -----------------------
    | B part |          |  C::f  | -delta(B)  |
    ----------          |  B::g  |  0         |
    | C part |          -----------------------
    ----------

    pb->f();    // call of C::f:
                // register vtbl_entry* vt = &pb->vtbl[index(f)];
                // (*vt->fct)((B*)((char*)pb+vt->delta))


Note that the object pointer may have to be adjusted to point to the correct sub-object before looking 
for the member pointing to the vtbl. Note also that each combination of base class and derived class 
has its own vtbl. For example, the vtbl for B in C is different from the vtbl of a separately allocated 
B. This implies that in general an object of a derived class needs a vtbl for each base class plus one 
for the derived class. However, as with single inheritance, a derived class can share a vtbl with its 
first base so that in the example above only two vtbls are used for an object of type C (one for A in C 
combined with C's own plus one for B in C). 

Using an int as the type of a stored delta limits the size of a single object; that might not be a bad 
thing. 
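The scheme can be simulated with ordinary structs and hand-built tables. Everything in this sketch is an assumption made for illustration: the struct and table names, the use of void* for the object pointer, and the use of offsetof to obtain delta(B). It models the layout described above and is not the code the translator generates.

```cpp
#include <cassert>
#include <cstddef>

// One vtbl entry: the function to call and the adjustment to apply to the
// sub-object pointer before calling it.
struct VtblEntry {
    int (*fct)(void* self);
    std::ptrdiff_t delta;
};

// A C object laid out by hand as an A part, a B part, and C's own member.
struct APart { const VtblEntry* vptr; int a; };
struct BPart { const VtblEntry* vptr; int b; };
struct CObj  { APart a_part; BPart b_part; int c; };

const std::ptrdiff_t delta_B = static_cast<std::ptrdiff_t>(offsetof(CObj, b_part));

// "C::f" expects a pointer to the whole C object; "B::g" expects the B part.
int C_f(void* self) {
    CObj* c = static_cast<CObj*>(self);
    return c->a_part.a + c->b_part.b + c->c;
}
int B_g(void* self) { return static_cast<BPart*>(self)->b; }

// The vtbl for A in C (shared with C itself) and the vtbl for B in C.
const VtblEntry vtbl_A_in_C[] = { { C_f, 0 } };
const VtblEntry vtbl_B_in_C[] = { { C_f, -delta_B }, { B_g, 0 } };

// A virtual call through a pointer to the B part: fetch the entry, adjust
// the pointer by the stored delta, and call indirectly.
int call_through_B(BPart* pb, int index) {
    const VtblEntry* vt = &pb->vptr[index];
    return vt->fct(reinterpret_cast<char*>(pb) + vt->delta);
}
```

Calling entry 0 through the B part reaches C_f with a pointer to the start of the whole object, while entry 1 reaches B_g with the B part unchanged.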


Ambiguities 


The following demonstrates a problem:

    class A { virtual void f(); };
    class B { virtual void f(); };
    class C : A , B { void f(); };

    C* pc = new C;

    pc->f();
    pc->A::f();
    pc->B::f();

Explicit qualification "suppresses" virtual so the last two calls really invoke the base class functions.
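The difference between the virtual call and the explicitly qualified calls can be seen by giving each function a distinct result. The return values and the virtual destructors are assumptions added for this sketch.

```cpp
#include <cassert>

struct A { virtual ~A() {} virtual int f() { return 1; } };
struct B { virtual ~B() {} virtual int f() { return 2; } };

// C::f overrides both A::f and B::f.
struct C : A, B { int f() override { return 3; } };
```

Through a pointer to either base the call reaches C::f, while pc->A::f() and pc->B::f() bypass the virtual mechanism and call the named base class function.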

Is this a problem? Usually, no. Either C has an f() and there is no need to use explicit qualification or
C has no f() and the explicit qualification is necessary and correct. Trouble can occur when a function
f() is added to C in a program that already contains explicitly qualified names. In the latter case one
could wonder why someone would want to both declare a function virtual and also call it using
explicit qualification. If f() is virtual, adding an f() to the derived class is clearly the correct way of
resolving the ambiguity.

The case where no C::f is declared cannot be handled by resolving ambiguities at the point of call.
Consider:

    class A { virtual void f(); };
    class B { virtual void f(); };
    class C : A , B { };    // error: C::f needed

    C* pc = new C;
    pc->f();                // ambiguous
    A* pa = pc;             // implicit conversion of C* to A*
    pa->f();                // not ambiguous: calls A::f();

The potential ambiguity in a call of f() is detected at the point where the virtual function tables for A
and B in C are constructed. In other words, the declaration of C above is illegal because it would
allow calls, such as pa->f(), which are unambiguous only because type information has been "lost"
through an implicit coercion; a call of f() for an object of type C is ambiguous.


Multiple Inclusions 


A class can have any number of base classes. For example:

    class A : B1, B2, B3, B4, B5, B6 { ... };

It is illegal to specify the same class twice in a list of base classes. For example,

    class A : B, B { ... };    // error

The reason for this restriction is that every access to a B member would be ambiguous and therefore 
illegal; this restriction also simplifies the compiler. 

Multiple Sub-objects 

A class may be included more than once as a base class. For example:

    class L { ... };
    class A : L { ... };
    class B : L { ... };
    class C : A , B { ... };

In such cases multiple objects of the base class are part of an object of the derived class. For example,
an object of class C has two L's: one for A and one for B:

    -----------------
    | L part (of A) |
    -----------------
    | A part        |
    -----------------
    | L part (of B) |
    -----------------
    | B part        |
    -----------------
    | C part        |
    -----------------

Think of L as a link class for a Simula-style linked list. In this case a C can
be on both the list of As and the list of Bs.

Naming 


Assume that class L in the example above has a member m. How could a function C::f refer to L::m?
The obvious answer is "by explicit qualification":

void C::f() { A::m = B::m; }

This will work nicely provided neither A nor B has a member m (except the one they inherited from
L). If necessary, the qualification syntax of C++ could be extended to allow the more explicit:

void C::f() { A::L::m = B::L::m; }
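The explicit qualification described above can be sketched as follows. This is a minimal illustration, assuming struct instead of class so that all members are public, and modern C++ syntax; the helper copies_b_into_a is hypothetical and exists only to make the effect observable.

```cpp
#include <cassert>

struct L { int m = 0; };
struct A : L {};
struct B : L {};

struct C : A, B {
    // An unqualified m would be ambiguous here; A::m and B::m each
    // select the m of one particular L sub-object.
    void f() { A::m = B::m; }
};

// Hypothetical helper: true if f() copied the m in B's L into the m in A's L.
bool copies_b_into_a() {
    C c;
    c.A::m = 1;
    c.B::m = 2;
    c.f();
    return c.A::m == 2 && c.B::m == 2;
}
```

The same qualification works from outside the class, as in c.A::m, which is how the helper sets up and inspects the two copies of m.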


Casting 

Consider the example above again. The fact that there are two copies of L makes casting (both explicit
and implicit) between L* and C* ambiguous, and consequently illegal:

C* pc = new C;
L* pl = pc;             // error: ambiguous
pl = (L*)pc;            // error: still ambiguous
pl = (L*)(A*)pc;        // The L in C's A
pc = pl;                // error: ambiguous
pc = (L*)pl;            // error: still ambiguous
pc = (C*)(A*)pl;        // The C containing A's L

I don't expect this to be a problem. The place where this will surface is in cases where As (or Bs) are
handled by functions expecting an L; in these cases a C will not be acceptable despite a C being an A:

extern f(L*);           // some standard function

A aa;
C cc;

f(&aa);                 // fine
f(&cc);                 // error: ambiguous
f((A*)&cc);             // fine

Casting is used for explicit disambiguation.
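A small sketch of disambiguation by casting through an intermediate base, written with struct and static_cast rather than the C-style casts used above. The function read_m stands in for "some standard function", and the other helper names are hypothetical.

```cpp
#include <cassert>

struct L { int m = 0; };
struct A : L {};
struct B : L {};
struct C : A, B {};

// Stand-in for "some standard function" that expects an L.
int read_m(L* p) { return p->m; }

// Hypothetical helper: reads m through the L sub-object that belongs to A.
int read_via_a(C* pc) { return read_m(static_cast<A*>(pc)); }

// Hypothetical helper: reads m through the L sub-object that belongs to B.
int read_via_b(C* pc) { return read_m(static_cast<B*>(pc)); }

// Hypothetical helper: recovers the enclosing C from the L inside its A part.
C* enclosing_c(L* pl) { return static_cast<C*>(static_cast<A*>(pl)); }
```

Passing a C* directly to read_m would be rejected as ambiguous; naming A or B in the cast selects which L is meant.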


Virtual Base Classes 

When a class C has two base classes A and B these two base classes give rise to separate sub-objects
that do not relate to each other in ways different from any other A and B objects. I call this
independent multiple inheritance. However, many proposed uses of multiple inheritance assume a dependence
among base classes (for example, the style of providing a selection of features for a window described
in this chapter under "Multiple Inheritance"). Such dependencies can be expressed in terms of an
object shared between the various derived classes. In other words, there must be a way of specifying
that a base class must give rise to only one object in the final derived class even if it is mentioned as a
base class several times. To distinguish this usage from independent multiple inheritance such base
classes are specified to be virtual:

class AW : virtual W { ... };
class BW : virtual W { ... };

class CW : AW , BW { ... };

A single object of class W is to be shared between AW and BW; that is, only one W object must be
included in CW as the result of deriving CW from AW and BW. Except for giving rise to a unique
object in a derived class, a virtual base class behaves exactly like a non-virtual base class.

The "virtualness" of W is a property of the derivation specified by AW and BW and not a property of
W itself. Every virtual base in an inheritance DAG refers to the same object. This object is constructed
once using a default constructor. A class that can only be constructed given an argument cannot be a
virtual base.
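The sharing of a single W object can be checked with a short sketch. It assumes struct for public access and a data member w added for illustration; the helper shares_single_w is hypothetical.

```cpp
#include <cassert>

struct W { int w = 0; };
struct AW : virtual W {};
struct BW : virtual W {};
struct CW : AW, BW {};

// Hypothetical helper: true if the AW and BW parts of a CW share one W.
bool shares_single_w(CW& c) {
    W* via_aw = static_cast<AW*>(&c);   // W reached through the AW part
    W* via_bw = static_cast<BW*>(&c);   // W reached through the BW part
    return via_aw == via_bw;
}
```

Because there is only one W sub-object, the unqualified name c.w is not ambiguous, in contrast to the independent case.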

A class may be both a normal and a virtual base in an inheritance DAG:

class A : virtual L { ... };
class B : virtual L { ... };
class C : A , B { ... };
class D : L, C { ... };

A D object will have two sub-objects of class L, one virtual and one "normal."

Representation

The object representing a virtual base class W cannot be placed in a fixed position relative to
both AW and BW in all objects. Consequently, a pointer to W must be stored in all objects directly
accessing the W object to allow access independently of its relative position. For example:


AW* paw = new AW;
BW* pbw = new BW;
CW* pcw = new CW;



paw ..>  AW part   (contains a pointer, v, to the W part)
         W part

pbw ..>  BW part   (contains a pointer, v, to the W part)
         W part

pcw ..>  AW part   (contains a pointer, v, to the W part)
         BW part   (contains a pointer, v, to the W part)
         CW part   (contains a pointer, v, to the W part)
         W part

A class can have an arbitrary number of virtual base classes.

One can cast from a derived class to a virtual base class, but not from a virtual base class to a derived
class. The former involves following the virtual base pointer; the latter cannot be done given the
information available at run time. Storing a "back-pointer" to the enclosing object(s) is non-trivial in
general and was considered unsuitable for C++, as was the alternative strategy of dynamically keeping
track of the objects "for which" a given member function invocation operates.
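The representation described in this section can be modelled by hand. The following is only a sketch of the idea, not what any compiler literally emits: all struct names, the member name vbase, and the function read_w are assumptions made for illustration.

```cpp
#include <cassert>

// Hand-written model of the layout, not compiler output.
struct WPart  { int w; };
struct AWPart { WPart* vbase; int a; };   // vbase: pointer to the shared W part
struct BWPart { WPart* vbase; int b; };
struct CWPart { WPart* vbase; int c; };

struct AWObject {                 // models a complete AW object
    AWPart aw;
    WPart  w;
    AWObject() : aw{&w, 0}, w{0} {}
};

struct CWObject {                 // models a complete CW object
    AWPart aw;
    BWPart bw;
    CWPart cw;
    WPart  w;
    CWObject() : aw{&w, 0}, bw{&w, 0}, cw{&w, 0}, w{0} {}
};

// Code written against an AW part follows the stored pointer, so it works
// wherever the W part happens to sit in the enclosing object.
int read_w(const AWPart* p) { return p->vbase->w; }
```

The W part sits directly after the AW part in AWObject but after the BW and CW parts in CWObject, yet read_w is the same in both cases. Note that nothing in WPart points back to the enclosing object, which is why the cast from a virtual base to a derived class has no information to work with.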


Virtual Functions 

Consider: 

class W {
    virtual void f();
    virtual void g();
    virtual void h();
    virtual void k();
};

class AW : virtual W { void g(); };
class BW : virtual W { void f(); };
class CW : AW , BW { void h(); };

CW* pcw = new CW;

pcw->f();               // BW::f()
pcw->g();               // AW::g()
pcw->h();               // CW::h()

((AW*)pcw)->f();        // BW::f();

A CW object might look like this:

    AW part   (v: pointer to the W part)
    BW part   (v: pointer to the W part)
    CW part   (v: pointer to the W part)
    W part    (vptr: pointer to the vtbl)

    vtbl:
    BW::f | delta(BW)-delta(W)
    AW::g | -delta(W)
    CW::h | -delta(W)
    W::k  | 0


In general, the delta stored with a function pointer in a vtbl is the delta of the class defining the
function minus the delta of the class for which the vtbl is constructed.

If W has a virtual function f that is re-defined in both AW and BW but not in CW an ambiguity
results. Such ambiguities are easily detected at the point where CW's vtbl is constructed.

The rule for detecting ambiguities in a class lattice, or more precisely a directed acyclic graph (DAG) of
classes, is that all redefinitions of a virtual function from a virtual base class must occur on a single
path through the DAG. The example above can be drawn as a DAG like this:

         W { f g h k }
          ^         ^
          |         |
     AW { g }   BW { f }
          ^         ^
          |         |
          CW { h }

Note that a call "up" through one path of the DAG to a virtual function may result in the call of a
function (redefined) in another path (as happened in the call ((AW*)pcw)->f() in the example above).
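The call "up" one path that lands in another path can be observed directly. This is a minimal sketch in modern C++, assuming struct for public access and functions that return their own names; the helper call_f_through_aw is hypothetical.

```cpp
#include <cassert>
#include <string>

struct W {
    virtual ~W() = default;
    virtual std::string f() { return "W::f"; }
    virtual std::string g() { return "W::g"; }
    virtual std::string h() { return "W::h"; }
    virtual std::string k() { return "W::k"; }
};
struct AW : virtual W { std::string g() override { return "AW::g"; } };
struct BW : virtual W { std::string f() override { return "BW::f"; } };
struct CW : AW, BW    { std::string h() override { return "CW::h"; } };

// Hypothetical helper: calls f() "up" through the AW path of the DAG.
std::string call_f_through_aw(CW& c) {
    AW* paw = &c;
    return paw->f();   // AW does not redefine f; the call reaches BW::f
}
```

Each of f, g, and h is redefined on a single path only, so the declaration of CW is accepted. Redefining f in AW as well, without also redefining it in CW, would make CW ill-formed.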


Constructors and Destructors 

Constructors for base classes are called before the constructor for their derived class. Destructors for
base classes are called after the destructor for their derived class. Destructors are called in the reverse
order of their declaration.

Arguments to base class constructors can be specified like this:

class A { A(int); };
class B { B(int); };
class C : A , virtual B {
    C(int a, int b) : A(a) , B(b) { ... }
};


Constructors are executed in the order their objects are declared. This rule is applied to members and
base classes separately and the base class constructors are applied before the member constructors.
When a class has more than one base class all argument lists for its base class constructors must be
qualified with the name of the base class. This rule applies even if only one of the base classes actually
requires arguments.

A virtual base is constructed before any of its derived classes. Virtual bases are constructed before any
non-virtual bases and in the order they appear on a depth first left-to-right traversal of the inheritance
DAG (directed acyclic graph). This rule applies recursively for virtual bases of virtual bases.

A virtual base is initialized by the "most derived" class of which it is a base. For example: 

class V { public: V(); V(int); /* ... */ };
class A : public virtual V { public: A(); A(int); /* ... */ };
class B : public virtual V { public: B(); B(int); /* ... */ };
class C : public A, public B { public: C(); C(int); /* ... */ };

A::A(int i) : V(i) { /* ... */ }
B::B(int i) { /* ... */ }
C::C(int i) { /* ... */ }

V v(1);                 // use V(int)
A a(2);                 // use V(int)
B b(3);                 // use V()
C c(4);                 // use V()


The order of destructor calls is defined to be the reverse order of appearance in the class declaration
(members before bases). There is no way for the programmer to control this order, except by the
declaration order. A virtual base is destroyed after all of its derived classes.
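The construction and destruction rules for virtual bases can be traced with a small sketch. The global trace string and the template lifetime_trace are hypothetical additions for illustration; the class shapes follow the V, A, B, C example in this section.

```cpp
#include <cassert>
#include <string>

std::string trace;   // records the order of constructor and destructor calls

struct V {
    V()    { trace += "V() "; }
    V(int) { trace += "V(int) "; }
    ~V()   { trace += "~V "; }
};
struct A : virtual V {
    A(int i) : V(i) { trace += "A "; }   // V(i) is used only when A is the most derived class
    ~A() { trace += "~A "; }
};
struct B : virtual V {
    B(int) { trace += "B "; }
    ~B() { trace += "~B "; }
};
struct C : A, B {
    C(int i) : A(i), B(i) { trace += "C "; }   // no initializer for V, so V() is used
    ~C() { trace += "~C "; }
};

// Hypothetical helper: builds and destroys one object of type T and
// returns the recorded sequence of calls.
template <class T>
std::string lifetime_trace() {
    trace.clear();
    { T obj(1); }
    return trace;
}
```

For C the virtual base is built first with V(), then A, B, and C, and everything is destroyed in the reverse order. For a complete A object the initializer V(i) in A's constructor takes effect.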

Assignment to this in the constructor of a class that takes part in a multiple inheritance lattice is likely
to lead to disaster. See Chapter 1 for alternatives.

Visibility


The examples above ignored visibility considerations. A base class may be public or private. In
addition, it may be virtual. For example:

class D
    : B1                    // private (by default), non-virtual (by default)
    , virtual B2            // private (by default), virtual
    , public B3             // public, non-virtual (by default)
    , public virtual B4 {
    // ...
};

Note that a visibility or virtual specifier applies to a single base class only. For example,
class C : public A, B { ... };
declares a public base A and a private base B.
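That a specifier applies to one base only can be checked at compile time. This sketch uses the standard trait std::is_convertible from modern C++; the two constant names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

class A {};
class B {};
class C : public A, B {};   // public applies to A only; B is a private base

// Conversion to a public base is allowed outside the class...
constexpr bool converts_to_a = std::is_convertible<C*, A*>::value;
// ...but conversion to a private base is not.
constexpr bool converts_to_b = std::is_convertible<C*, B*>::value;
```

Writing class C : public A, public B would make both conversions available.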


Overheads 


The overhead in using this scheme is: 

1. one subtraction of a constant for each use of a member in a base class that is included as the 
second or subsequent base 

2. one word per function in each vtbl (to hold the delta)

3. one memory reference and one subtraction for each call of a virtual function

4. one memory reference and one subtraction for access of a base class member of a virtual base
class

Note that overheads [1] and [4] are only incurred where multiple inheritance is actually used, but
overheads [2] and [3] are incurred for each class with virtual functions and for each virtual function
call even when multiple inheritance is not used. Overheads [1] and [4] are only incurred when
members of a second or subsequent base are accessed "from the outside"; a member function of a
virtual base does not incur special overheads when accessing members of its class.

This implies that except for [2] and [3] you pay only for what you actually use; [2] and [3] impose a
minor overhead on the virtual function mechanism even where only single inheritance is used. This
latter overhead could be avoided by using an alternative implementation of multiple inheritance, but I
don't know of such an implementation that is also faster in the multiple inheritance case and as
portable as the scheme described here.

Fortunately, these overheads are not significant. The time, space, and complexity overheads imposed
on the compiler to implement multiple inheritance are not noticeable to the user.

But is it Simple to Use?

What makes a language facility hard to use? 

1. Lots of rules. 

2. Subtle differences between rules. 

3. Inability to automatically detect common errors. 

4. Lack of generality. 

5. Deficiencies. 

The first two cases lead to difficulty of learning and remembering, causing bugs due to misuse and
misunderstanding. The last two cases cause bugs and confusion as the programmer tries to
circumvent the rules and "simulate" missing features. Case [3] causes frustration as the programmer
discovers mistakes the hard way.

The multiple inheritance scheme presented here provides two ways of extending a class's name space: 

■ a base class

■ a virtual base class

These are two ways of creating/specifying a new class rather than ways of creating two different kinds
of classes. The rules for using the resulting classes do not depend on how the name space was
extended:

■ ambiguities are illegal

■ rules for use of members are what they were for single inheritance

■ visibility rules are what they were for single inheritance

■ initialization rules are what they were for single inheritance

Violations of these rules are detected by the compiler.

In other words, the multiple inheritance scheme is only more complicated to use than the existing
single inheritance scheme in that

■ you can extend a class's name space more than once (with more than one base class)

■ you can extend a class's name space in two ways rather than in only one way 

This appears minimal and constitutes an attempt to provide a formal and (comparatively) safe set of 
mechanisms for observed practices and needs. I think that the scheme described here is "as simple as 
possible, but no simpler." 

A potential source of problems exists in the absence of "system provided back-pointers" from a virtual
base class to its enclosing object.

In some contexts, it might also be a problem that pointers to sub-objects are used extensively. This
will affect programs that use explicit casting to non-object-pointer types (such as char*) and "extra
linguistic" tools (such as debuggers and garbage collectors). Otherwise, and hopefully normally, all
manipulation of object pointers follows the consistent rules explained previously and is invisible to the
user.

Conclusions


Multiple inheritance is reasonably simple to add to C++ in a way that makes it easy to use. Multiple 
inheritance is not too hard to implement, since it requires only very minor syntactic extensions, and 
fits naturally into the (static) type structure. The implementation is very efficient in both time and 
space. Compatibility with C is not affected. Portability is not affected.

Footnotes


1. An earlier version of this paper was presented to the European UNIX Users' Group conference
in Helsinki, May 1987. This paper has been revised to match the multiple inheritance scheme
that was arrived at after further experimentation and thought. For more information see "The
Evolution of C++: 1985-1987" and "What is 'Object-Oriented Programming'?"

2. In most of this paper data hiding issues are ignored to simplify the discussion and shorten the
examples. This makes some examples illegal. Changing the word class to struct would make
the examples legal, as would adding public specifiers in the appropriate places.

3. This definition is equivalent except for possible side effects in constructors and destructors
(access to global variables, input operations, output operations, etc.).


Type-safe Linkage for C++


NOTE 


This chapter is taken directly from a paper by Bjarne Stroustrup.


Abstract 


This paper describes the problems involved in generating names for overloaded functions in C++ and 
in linking to C programs. It also discusses how these problems relate to library building. It presents a 
solution that provides a degree of type-safe linkage. This eliminates several classes of errors from C++ 
and allows libraries to be composed more freely than has hitherto been possible. Finally the current 
encoding scheme for C++ names is presented. 


Introduction 


This paper describes the type-safe linkage scheme used by the 2.0 release of C++ and the mechanism
provided to allow traditional (unsafe) linkage to non-C++ functions. It describes the problems with
the scheme used by previous releases, the alternative solutions considered, and the practicalities
involved in converting from the old linkage scheme to the new.

The new scheme makes the overload keyword redundant, simplifies the construction of tools operating
on C++ object code, makes the composition of C++ libraries simpler and safer, and enables reliable
detection of subtle program inconsistencies. The scheme does not involve any run-time costs and does
not appear to add measurably to compile and link time.

The scheme is compatible with older C++ implementations for pure C++ programs but requires
explicit specification of linkage requirements for linkage to non-C++ functions.


The Original Problem 


C++ allows overloading of function names; that is, two functions may have the same name provided
their argument types differ sufficiently for the compiler to tell them apart. For example,

double sqrt(double);
complex sqrt(complex);

Naturally, these functions must have different names in the object code produced from a C++
program. This is achieved by suffixing the name the user chose with an encoding of the argument types
(the signature of the function). Thus the names of the two sqrt() functions become:

sqrt__Fd            // the sqrt that takes a double argument
sqrt__F7complex     // the sqrt that takes a complex argument
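The pattern visible in these two names can be imitated with a toy encoder. This is only a sketch under assumptions: it covers just the forms that appear as examples in this chapter (a name, the separator __F, a one-letter code such as d for double, and a class name preceded by its length), and the functions class_code and encode are hypothetical, not part of any C++ implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: a class name is written as its length followed by
// the name itself, as in 7complex.
std::string class_code(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// Hypothetical helper: appends "__F" and the argument codes to the
// function name chosen by the user.
std::string encode(const std::string& name,
                   const std::vector<std::string>& arg_codes) {
    std::string out = name + "__F";
    for (const std::string& code : arg_codes) out += code;
    return out;
}
```

For instance, encode("sqrt", {"d"}) gives the first name above and encode("sqrt", {class_code("complex")}) gives the second. The length prefix lets a tool find where a class name ends without a separator character.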

Some details of the encoding scheme are described under "The Function Name Encoding Scheme."

When experiments along this line began five years ago it was immediately noticed that for many sets
of overloaded functions there was exactly one function of that name in the standard C library. Since C
does not provide function name overloading there could not be two. It was deemed essential for C++
to be able to use the C libraries without modification, recompilation, or indirection. Thus the problem
became to design an overloading facility for C++ that allowed calls to C library functions such as sqrt()
even when the name sqrt was overloaded in the C++ program.


The Original Solution 


The solution, as used in all non-experimental C++ implementations up to now, was to let the name
generated for a C++ function be the same as would be generated for a C function of the same name
wherever possible. Thus open() gets the name open on systems where C doesn't modify its names on
output, the name _open on systems where C prepends an underscore, etc.

This simple scheme clearly isn't sufficient to cope with overloaded functions. The keyword overload
was introduced to distinguish the hard case from the easy one and also because function name
overloading was considered a potentially dangerous feature that should not be accidentally or implicitly
applied. In retrospect this was a mistake.

To allow linkage to C functions the rule was introduced that only the second and subsequent versions
of an overloaded function had their names encoded. Thus the programmer would write

overload sqrt;

double sqrt(double);        // sqrt
complex sqrt(complex);      // sqrt__F7complex

and the effect would be that the C++ compiler generated code referring to sqrt and sqrt__F7complex.
This enabled a C++ programmer to use the C libraries. This trick solves the problems of name
encoding, linkage to C, and protection against accidental overloading, but it is clearly a hack. Fortunately it
was only documented in the BUGS section of the C++ manual page.


Problems with the Original Solution 

There are at least three problems with this scheme: 

■ how to name overloaded functions so that one may be a C function 

■ how to detect errors caused by inconsistent function declarations 

■ how to specify libraries so that several libraries can be easily used together

The overload Linkage Problem 

Consider a program that uses an overloaded function print() to output globs and widgets. Naturally
globs are defined in glob.h and widgets in widget.h. A user writes

// file1.c:
#include <glob.h>
#include <widget.h>

but this elicits an error message from the C++ compiler since print() is declared twice with different
argument types. The user then modifies the program to read

// file1.c:
overload print;
#include <glob.h>
#include <widget.h>

and all is well until someone in some other part of the program writes

// file2.c:
overload print;
#include <widget.h>
#include <glob.h>

This fails to link since file1.c's output refers to print (meaning print(glob)) and print__F6widget,
whereas file2.c's output refers to print (meaning print(widget)) and print__F4glob.

This is of course a nuisance, but at least the program fails to link and the programmer can (after
some detective work based on relatively uninformative linker error messages) fix the problem. The
nastier variation of this will happen to the conscientious programmer who knows that print() is
overloaded and inserts the appropriate overload declarations, but happens to use only one variation of
print() in each of two source files:

// file1.c:
overload print;
#include <glob.h>

// file2.c:
overload print;
#include <widget.h>

The output from file1.c and file2.c now both refer to print. Unfortunately, in the output from file1.c
print means print(glob) whereas print refers to print(widget) in the output from file2.c. One might
expect linkage to fail because print() has been defined twice. However, on most systems this is not
what happens in the important case where the definitions of print(glob) and print(widget) are placed
in libraries. Then, the linker simply picks the first definition of print() it encounters and ignores the
second. The net effect is that calls (silently) go to the wrong version of print(). If we are lucky, the
program will fail miserably (core dump); if not, we will simply get wrong results.

The requirement that the overload keyword must be used explicitly and the non-uniform treatment of
overloaded functions ("the first overloaded function has C linkage") is a cause of complexity in C++
compilers and in other tools that deal with C++ program text or with object code generated by a C++
compiler.

The General Linkage Problem

This problem of inconsistent linkage is a variation of the general problem that C provides only the
most rudimentary facilities for ensuring consistent linkage. For example, even in ANSI C and in C++
(until now) the following example will compile and link without warning:

#include <stdio.h>
extern int sqrt(int);

main()
{
    printf("sqrt(%d) = %d\n", 2, sqrt(2));
}

and produce output like this

sqrt(2) = 0

because even though the user clearly specified that an integer sqrt() was to be used, the C
compiler/linker uses the double precision floating point sqrt() from the standard library. This problem
can be handled by consistent and comprehensive use of correct and complete header files. However,
that is not an easy thing to achieve reliably and is not standard practice. The traditional C and C++
compiler/linker systems do not provide the programmer with any help in detecting errors, oversights,
or dangerous practices.

These linkage problems are especially nasty because they increase disproportionately with the size of
programs and with the amount of library use.

Combining Libraries 

The standard header complex.h overloads sqrt():

// complex.h:
overload sqrt;
#include <math.h>
complex sqrt(complex);

Some other header, 3d.h, declares sqrt() without overloading it:

// 3d.h:
#include <math.h>

Now a user wants both the 3d and the complex number packages in a program:

#include <3d.h>
#include <complex.h>

Unfortunately this does not compile because of this sequence of operations:

double sqrt(double);    // from <3d.h>
overload sqrt;          // from <math.h> via <complex.h>

A function must be overloaded before its first declaration is processed. So the programmer, who
really did not want to know about the internals of those headers, must reorder the #include directives
to get the program to compile:

#include <complex.h>
#include <3d.h>

This will work unless 3d.h overloads some function, say atan(), that complex.h does not. Even in that
case the programmer can cope with the problem by adding sufficient overload declarations where 3d.h
and complex.h are included:

overload sqrt;
overload atan;
#include <3d.h>
#include <complex.h>

This reordering and/or adding of overload declarations is work that is really quite spurious and in
any case irrelevant to the job the programmer is trying to do. Worse, if the extra overload
declarations were placed in a header file the programmer has now set the scene for the users of the new
package to have exactly the same problems when they try combining this new library with other libraries.

It becomes tempting to overload all functions or at least to provide header files that overload all
interesting functions. This again defeats any real or imagined benefits of requiring explicit overload
declarations.



A General Solution 


The overloading scheme used for C++ (until now) interacts with the traditional C linkage scheme in
ways that bring out the worst in both. Overloading of function names that was introduced to provide
notational convenience for programmers is becoming a noticeable source of extra work and complexity
for builders and users of libraries. Either the idea of overloading is bad or else its implementation in
C++ is deficient. The insecure C linkage scheme is a source of subtle and not-so-subtle errors. In
summary:

1. lack of type checking in the linker causes problems

2. use of the overload keyword causes problems

3. we must be able to link C++ and C program fragments

A solution to 1 is to augment the name of every function with an encoding of its signature. A solution
to 2 is to cease to require the use of overload (and eventually abolish it completely). A solution to 3 is
to require a C++ programmer to state explicitly when a function is supposed to have C-style linkage.

The question is whether a solution based on these three premises can be implemented without
noticeable overhead and with only minimal inconvenience to C++ programmers. The ideal solution would

■ require no C++ language changes

■ provide type-safe linkage

■ allow for simple and convenient linkage to C

■ not break existing C++ code

■ allow use of (ANSI style) C headers

■ provide good error detection and error reporting

■ be a good tool for library building

■ not impose run-time overhead

■ not impose compile time overhead

We have not been able to devise a scheme that fulfills all of these criteria strictly, but the adopted 
scheme is a good approximation. 

Type-safe C++ Linkage 

First of all, every C++ function name is encoded by appending its signature. This ensures that a
program will only load provided every function that is called has a definition and that the type specified
at the point of call is the same as the type specified at the point of definition. For example, given:

f(int i) { ... }                // f__Fi
f(int i, char* j) { ... }       // f__FiPc

These examples will cause correct linkage:

extern f(int);                  // f__Fi - links to f(int)
f(1);

extern f(int,char*);            // f__FiPc - links to f(int,char*)
f(1,"asdf");

These examples will cause linkage errors independent of where in the program they occur because no
f() with a suitable signature has been defined:

// no declaration of f() in this file
// (this is only legal in C programs)
f(1);                           // f - links to ???

extern f(char*);                // f__FPc - links to ???
f("asdf");

extern f(int ...);              // f__Fie - links to ???
f(1,"asdf");


One might consider extending this encoding scheme to include global variables, etc., but this does not
appear to be a good idea since that would introduce at least as many problems as it would solve. For
example:

// file1.c:
int aa = 1;
extern int bb;

// file2.c:
char* aa = "asdf";      // error: aa is declared int in file1.c
extern char* bb;        // error: bb is declared int in file1.c

Under the current C scheme, the double definition of aa will be caught and the inconsistent
declarations of bb will not. Using an encoding scheme, the double definition of aa would not be caught since
the difference in encoding would cause two differently named objects to be created, contrary to the
rules of C and C++. The fact that the inconsistent declarations of bb would be caught by some linkers
(not all) does not compensate for the incorrect linkage of aa. Consequently only functions are encoded
using their signatures.

This linkage scheme is much safer than what is currently used for C, but it is not meant to solve all
linkage problems. For example, if two libraries each provides a function f(int) as part of their public
interface there is no mechanism that allows the compiler to detect that there are supposed to be two
different f(int)s. If the .o files are loaded together the linker will detect the error, but where a library
search mechanism is employed the error may go undetected.

Note that this linking scheme simply enforces the C++ rules that every function must be declared
before it is called and that every declaration of an external name in C++ must have exactly the same
type.

In essence, we use the name encoding scheme to "trick" the linker into doing type checking of the
separately compiled files. More comprehensive solutions can be achieved by modifying the linker to
understand C++ types. For example, a linker could check the types of global data objects and might
also be able to provide features for ensuring the consistency of global constants and classes. However,
getting an improved linker into use is typically a hard and slow process. The scheme presented here
is portable across a great range of systems and can be used immediately.

Implicit Overloading 

If a function is declared twice with different argument types it is overloaded. For example: 

double sqrt(double); 
complex sqrt(complex);

is accepted without any explicit overload declaration. Naturally, overload declarations will be 
accepted in the foreseeable future; they are simply not necessary any more. 

Does this relaxation of the C++ rules cause new problems? It does not appear to be the case. For 
example, originally I imagined that obvious mistakes such as 

double sqrt(double);            // sqrt__Fd

double d = sqrt(2.3);

double sqrt(int d) { ... }      // sqrt__Fi

would cause hard-to-find errors. It certainly would with the traditional C linkage rules, but with
type-safe linkage the program simply will not link because there is no function called sqrt__Fd
defined anywhere. Even the standard library function will not be found because its name is sqrt as
always.

Another imagined problem was that a call 
f (x); 

would suddenly change its meaning when a function became overloaded by the inclusion of a new
header file containing the declaration of another function f(). This is not the case, because the C++
ambiguity rules ensure that the introduction of a new f() will either leave the meaning of f(x)
unchanged (the new f() was unrelated to the type of x) or will cause a compile time error because an
ambiguity was introduced.
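The first of these two outcomes can be sketched in a few lines. The type widget and the helper call_with_double are hypothetical names chosen for illustration, and the functions return strings only so that the chosen overload can be observed.

```cpp
#include <cassert>
#include <string>

std::string f(double) { return "f(double)"; }

// Imagine this declaration arriving later through a new header file.
struct widget {};
std::string f(widget) { return "f(widget)"; }

// Hypothetical helper: the call f(x) with x of type double.
std::string call_with_double() {
    double x = 2.5;
    return f(x);   // the new f(widget) is unrelated to double, so the meaning is unchanged
}
```

Had the new header declared f(float) instead, a call such as f(1) with an int argument would be rejected as ambiguous at compile time rather than silently changing its meaning.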

C Linkage 

This leaves the problem of how to call a C function or a C++ function "masquerading" as a C
function. To do this a programmer must state that a function has C linkage. Otherwise, a function is
assumed to be a C++ function and its name is encoded. To express this an extension of the "extern"
declaration is introduced into C++:

extern "C" {
    double sqrt(double);    // sqrt(double) has C linkage
}


This linkage specification does not affect the semantics of the program using sqrt() but simply tells the
compiler to use the C naming conventions for the name used for sqrt() in the object code. This means
that the name of this sqrt() is sqrt or _sqrt or whatever is required by the C linkage conventions on a
given system. One could even imagine a system where the C linkage rules were the type-safe C++
linkage rules as described above so that the name of sqrt() was sqrt__Fd. Linkage specifications nest,
so that if we had other linkage conventions such as Pascal linkage we could write:

// default: C++ linkage here
extern "C" {
    // C linkage here
    extern "Pascal" {
        // Pascal linkage here
        extern "C++" {
            // C++ linkage here
        }
        // Pascal linkage here
    }
    // C linkage here
}
// C++ linkage here

Such nestings will typically only occur as the result of nested #includes.
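Nesting can be tried with the two linkage strings that every C++ implementation accepts, "C" and "C++". The function names c_style_add and cpp_style_add are hypothetical and serve only to show where each linkage applies.

```cpp
#include <cassert>

extern "C" {
    // C linkage: the object-code name follows the platform's C convention.
    int c_style_add(int a, int b);

    extern "C++" {
        // Back to C++ linkage inside the nested block: the name is encoded.
        int cpp_style_add(int a, int b);
    }
}

// The definitions pick up the linkage of the earlier declarations.
extern "C" int c_style_add(int a, int b) { return a + b; }
int cpp_style_add(int a, int b) { return a + b; }
```

Both functions are called in exactly the same way from C++ code; only the names seen by the linker differ.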

The {} in a linkage specification does not introduce a new scope; the braces are simply for
grouping. This strongly resembles the use of {} in enumerations.

The keyword extern was used because it is already used to specify linkage in C and C++. Strings (for
example, "C" and "C++") were chosen as linkage specifiers because identifiers (e.g., C and Cplusplus)
would de facto introduce new keywords into the language and because a larger alphabet can be used
in strings.
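
As a small illustration of the grouping role of the braces (a sketch that is not part of the original
paper; the names c_half, c_twice, and half_then_twice are made up for this example), two functions
can be given C linkage by one specification and then used outside it:

```cpp
// Two functions given C linkage by a single linkage specification.
// c_half and c_twice are hypothetical names used only for this sketch.
extern "C" {
    double c_half(double x) { return x / 2.0; }
    double c_twice(double x) { return x * 2.0; }
}

// The names are visible here because the { } of a linkage specification
// only groups declarations; it does not introduce a scope.
double half_then_twice(double x) { return c_twice(c_half(x)); }
```

Had the braces opened a scope, the call in half_then_twice() could not see either function.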

Naturally, only one of a set of overloaded functions can have C linkage, so the following causes a
compile time error:

    extern "C" {
        double sqrt(double);
        complex sqrt(complex);
    }

Note that C linkage can be used for C++ functions intended to be called from C programs as well as
for C functions. In particular, it is necessary to use C linkage for C++ functions written to implement
standard C library functions for use by C programs. However, using the encoded C++ name from C
preserves type safety at link time. This technique can be valuable in other languages too. I have
already seen an example of the C++ scheme applied to assembly code to prevent nasty link errors for
low level routines. One might consider using this C++ linkage scheme for C also, but I suspect that
the sloppy use of type information in many C programs would make that too painful.
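
The legal counterpart of the erroneous example can be sketched as follows. This is only an
illustration: Complex is a stand-in struct and magnitude is a hypothetical name. Exactly one
function of the overloaded set is given C linkage and the other keeps the default C++ linkage:

```cpp
#include <cmath>

struct Complex { double re, im; };   // stand-in for a complex class

// Exactly one function named magnitude has C linkage ...
extern "C" double magnitude(double x) { return std::fabs(x); }

// ... and the other overload keeps the default (encoded) C++ linkage.
double magnitude(Complex z) { return std::sqrt(z.re * z.re + z.im * z.im); }
```

A C program could call the first function by the plain name magnitude; the second is reachable
only through its encoded name.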

In an "all C++" environment no linkage specifications would be needed. The linkage mechanism is
intended to ease integration of C++ code into a multi-lingual system.

Caveat 

One could extend this linkage specification mechanism to other languages such as Fortran, Lisp,
Pascal, PL/1, etc. The way such an extension is done should be considered very carefully because one
"obvious" way of doing it would be to build into a C++ compiler the full knowledge of the type
structure and calling conventions of such "foreign" languages. For example, a C++ compiler might handle
conversion of zero terminated C++ strings into Pascal strings with a length prefix at the call point of
a function with Pascal linkage and might use Fortran call by reference rules when calling a function
with Fortran linkage, etc.

There are serious problems with this approach: 

■ The complexity and speed of a C++ compiler could be seriously affected by such extensions.

■ Unless an extension is widely available and accepted, programs using it will not be portable.

■ Two implementations might "extend" C++ with a linkage specification to the same "foreign"
language, say Fortran, in different ways so as to make identical C++ programs have subtly
different effects on different implementations.

Naturally, these problems are not unique to linkage issues or to this approach to linkage specification. 

I conjecture that in most cases linkage from C++ to another language is best done simply by using a
common and fairly simple convention such as "C linkage" plus some standard library routines and/or
rules for argument passing, format conversion, etc., to avoid building knowledge of non-standard
calling conventions into C++ compilers. This ought to be simpler from C++ than from most other
languages. For example, reference type arguments can be used to handle Fortran argument passing
conventions in many cases and a Pascal string type with a constructor taking a C style string can
trivially be written. Where extensions are unavoidable, however, C++ now provides a standard syntax for
expressing them.
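
The two library-level techniques just mentioned might look roughly like the sketch below. Everything
in it is assumed for illustration: PascalString, fortran_style_add, add, and pascal_length are
hypothetical names, and the "foreign" routine is written in C++ so that the example is self-contained.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical Pascal-style string: one length byte followed by the text.
class PascalString {
    unsigned char buf[256];
public:
    // Constructor taking a zero-terminated C string (truncated to 255 chars).
    PascalString(const char* s) {
        std::size_t n = std::strlen(s);
        if (n > 255) n = 255;
        buf[0] = static_cast<unsigned char>(n);
        std::memcpy(buf + 1, s, n);
    }
    std::size_t length() const { return buf[0]; }
    const unsigned char* data() const { return buf; }
};

// Stand-in for a routine that follows a call-by-reference convention.
extern "C" void fortran_style_add(const int* a, const int* b, int* sum) {
    *sum = *a + *b;
}

// C++ wrapper: reference arguments let callers write add(x, y, z)
// while addresses are what actually gets passed along.
inline void add(const int& a, const int& b, int& sum) {
    fortran_style_add(&a, &b, &sum);
}

// Stand-in for a routine expecting a Pascal string; the conversion from
// a C style string happens implicitly at the call point.
inline std::size_t pascal_length(const PascalString& s) { return s.length(); }
```

With these in place a caller can write add(2, 3, result) or pascal_length("hello") without the
compiler knowing anything about the other language.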
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Experience 

The natural first reaction to this scheme is to look for a way of handling linkage and overloading
without requiring explicit linkage specifications. We have not been able to come up with a system that
enabled C linkage to be implicit without serious side effects. I will summarize the advantages of the
adopted scheme here and discuss several possible objections to it. "Alternative Solutions" below
describes alternative schemes that were considered and rejected.

Making Linkage Specifications Invisible 

One obvious advantage of this scheme is that it allows a programmer to give a set of functions C
linkage with a single linkage specification without modifying the individual function declarations. This is
particularly useful when standard C headers are used. Given a C header (that is, an ANSI C header
with function prototypes, etc.)

    // C header:

    // C declarations

one can trivially modify the header for use from C++:

    // C++ header:

    extern "C" {

    // C header:

    // C declarations

    }

This creates a C++ header that cannot be shared with C.
Sharing with C can be achieved using #ifdef:

    // C and C++ header:

    #ifdef __cplusplus
    extern "C" {
    #endif

    // C header:

    // C declarations

    #ifdef __cplusplus
    }
    #endif

where __cplusplus is defined by every C++ compiler.

In cases where one for some reason cannot or should not modify the header itself one can use an
indirection:

    // C++ header:

    extern "C" {
    #include "C_header"
    }

Fortunately, such transformations can be done by trivial programs so that most of the effort in
converting C headers need not be done by hand.
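
A compilable instance of the shared-header pattern is sketched below. The function c_add is a
hypothetical example, and the "header" is shown inline rather than in a separate file:

```cpp
/* ---- contents of a header usable from both C and C++ ---- */
#ifdef __cplusplus
extern "C" {
#endif

int c_add(int a, int b);   /* hypothetical C function */

#ifdef __cplusplus
}
#endif
/* ---- end of header ---- */

// A definition in C++; it takes its C linkage from the declaration above.
int c_add(int a, int b) { return a + b; }
```

A C compiler sees only the plain prototype, while a C++ compiler sees the same prototype wrapped
in a linkage specification.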

It was soon discovered that even though programmers tend to scatter function declarations throughout
the C++ program text, most C functions actually come from well defined C libraries for which there
are (or ought to be) standard header files.

Placing all of the necessary linkage specifications in standard header files means that they are not seen
by most users most of the time. Except for programmers studying the details of C library interfaces,
programmers installing headers for new C libraries for C++ users, and programmers providing C++
implementations for C interfaces, the linkage specifications are invisible.


Error Handling 

The linker detects errors, but reports them using the names found in the object code. This can be
compensated for by adding knowledge about the C++ naming conventions to the linker or (simpler) by
providing a filter for processing linker error messages. This output was produced by such a filter:

    C++ symbol mapping:

    PathListHead::~PathListHead()     __dt__12PathListHeadFv
    Path_list::sepWork()              sepWork__9Path_listFv
    Path::pathnorm()                  pathnorm__4PathFv
    Path::operator&(Path&)            __ad__4PathFR4Path
    Path::first()                     first__4PathFv
    Path::last()                      last__4PathFv
    Path::rmfirst()                   rmfirst__4PathFv
    Path::rmlast()                    rmlast__4PathFv
    Path::rmdots()                    rmdots__4PathFv
    Path::findpath(String&)           findpath__4PathFR6String
    Path::fullpath()                  fullpath__4PathFv

Bringing this filter into use had the curious effect of replacing the usual complaint about "ugly C++
names" with complaints that the linker didn't provide sufficient information about C functions and
global data objects.

The reason for presenting the encoded and unencoded names of undefined functions side by side is to
help users who use tools, such as debuggers, that haven't yet been converted to understand C++
names.

A plain C debugger such as sdb, dbx, or codeview can be used for C++ and will correctly refer to the
C++ source, but it will use the encoded names found in the object code. This can be avoided by
employing a routine that "reverses" the encoding, that is, reads an encoded name and extracts
information from it.¹ The encoding scheme is described under "The Function Name Encoding Scheme." A
standard C++ name decoder should be generally available for use by debugger writers and others who
deal directly with object code. Until such decoders are in widespread use the programmer must have
at least a minimal understanding of the encoding scheme.


Upgrading Existing C++ Programs

Decorating the standard header files with the appropriate linkage specifications had two effects. The
first phenomenon observed was that most of the declarations scattered in the program text that were
referring to C functions were either redundant (because the function had already been declared in a
header) or at least potentially incorrect (because they differed from the declaration of that header file
on some commonly used system). The second phenomenon observed was that every non-trivial
program converted to the new linkage system contained inconsistent function declarations. A noticeable
number of declarations found in the program text were plain wrong, that is, different from the ones
used in the function definition. This was caused in part by sloppiness, for example, where a
programmer had declared a function

    f(int ...);

to shut up the compiler instead of looking up the type of the second argument. A more common
problem was that the "standard" header files had changed since the function declaration was placed in
the text so that the "local" declaration didn't match any more; this often happens when a file is
transferred from one system to another, say from a BSD to a System V.

In summary, introducing the new linkage system involved adding linkage specifications. Typically,
these linkage specifications were only needed in standard header files. The process of introducing
linkage specifications invariably revealed errors in the programs, even in programs that had been
considered correct for years. The process strongly resembles trying lint on an old C program.

As was expected, some programmers first tried to get around the requirements for explicit C linkage
by enclosing their entire program in a linkage directive. This might have been considered a fine way
of converting old C++ programs with minimum effort had it not had the effect of ensuring that every
program that uses facilities provided by such a program would also have to use the unsafe C linkage.
To achieve the benefits from the new linkage scheme most C++ programs must use it. The
requirement that at most one of a set of overloaded functions can have C linkage defeats this way of
converting programs. The slightly slower and more involved method of using standard header files (already
containing the necessary linkage specifications) and adding a few extra linkage specifications in local
headers where needed must be used. This also has the benefit of unearthing unexpected errors.


Details 


The scope of C function declarations has always been a subject for debate. In the context of C++ with
linkage specifications and overloaded functions it seems prudent to answer some variations of the
standard questions.

Default Linkage 

Consider

    extern "C" {
        int f(int);
    }

    int f(int);    // default: f() has C++ linkage

Is it the same f() that was defined with C linkage above and does it have C or C++ linkage? It is the
same f() and it does (still) have C linkage. The first linkage specification "wins" provided the second
declaration has "only" default (that is, C++) linkage.

Where linkage is explicitly specified for a function, that specification must agree with any previous
linkage. For example:

    extern "C" {
        int f(int);        // f() has C linkage
    }

    int g();               // default: g() has C++ linkage

    extern "C++" {
        int f(int);        // error: inconsistent linkage specification
        int g();           // fine
    }

The reason to require agreement of explicit linkage specifications is to avoid unnecessary order
dependencies. The reason to allow a second declaration with implicit C++ linkage to take on the linkage
from a previous explicit linkage specification is to cope with the common case where a declaration
occurs both in a .c file and in a standard header file.
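
The two rules can be tried out with a sketch like the one below (the function bodies are invented
for illustration; the conflicting declaration is left commented out so that the fragment compiles):

```cpp
extern "C" {
    int f(int);            // f() has C linkage
}

int f(int);                // same f(); the earlier C linkage "wins"
int g();                   // default: g() has C++ linkage

extern "C++" {
    // int f(int);         // would be rejected: conflicts with the C linkage of f()
    int g();               // fine: agrees with the earlier declaration
}

int f(int x) { return x + 1; }   // defines the f() that has C linkage
int g() { return f(41); }
```

The plain redeclaration of f() is what typically happens when a source file repeats a declaration
that a standard header has already wrapped in a linkage specification.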

Declarations in Different Scopes 

Consider

    extern "C" {
        int f(int);
    }

    void g1()
    {
        int f(int);
        f(1);
    }

Is the f() declared local to g1() the same as the global f() and does the function called in g1() have C
linkage? It is the same f() and it does have C linkage.

Consider

    extern "C" {
        int f(int);
    }

    void g2()
    {
        int f(char*);
        f(1);
        f("asdf");
    }

Does the local declaration of f() overload the global f() or does it hide it? In other words, is the call
f(1) legal? That call is an error because the local declaration introduces a new f(). In the tradition of
C, the declaration of f(char*) also draws a warning.

Consider

    void g3()
    {
        int ff(int);
    }

    void g4()
    {
        int ff(char*);
        ff("asdf");
        ff(1);
    }

Does the second declaration of ff() overload the first? In other words, is the call ff(1) legal? The call is
an error and a warning is issued about the two declarations of ff() because (as in the above)
overloading in different scopes is considered a likely mistake.
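
The hiding behavior can be seen in a small compilable sketch. It is an illustration only: the
function bodies are invented, and const char* is used in place of char* so that a string literal
can be passed under current C++ rules.

```cpp
#include <cstring>

int f(int x) { return x * 2; }
int f(const char* s) { return static_cast<int>(std::strlen(s)); }

int call_with_local_declaration() {
    int f(const char*);    // local declaration: hides the global f(int)
    // f(1);               // would be an error here: f(int) is not visible
    return f("asdf");      // calls f(const char*)
}

int call_with_global_overloads() {
    return f(1) + f("asdf");   // at file scope both overloads are visible
}
```

The local declaration does not join the global overload set; it replaces it for the rest of the block.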

Local Linkage Specification 

Linkage specifications are not allowed inside function definitions. For example:

    void g5()
    {
        extern "C" {    // error: linkage specification in function
            int h();
        }
    }

The reason for this restriction is to discourage the use of local declarations of C functions and to
simplify the language rules.

Alternative Solutions


So, the linkage specification scheme works, but isn't there a better way of achieving the benefits of that
scheme? Several schemes were considered. This section presents the first two or three alternatives
people usually come up with and explains why we rejected them. Naturally, we also considered more
and weirder solutions, but all the plausible ones were variations of the ones presented here.


The Scope Trick 

The first attempt to provide type-safe linkage involved the use of overload and the C++ scope rules.
All overloaded function names were encoded, but non-overloaded function names were not. This
scheme had the benefit that the linkage rules for most functions were the C linkage rules, and had
the problem that those rules are unsafe. The most obvious problem was that at first there was no
way of linking an overloaded function to a standard C library function. This problem was handled
using a "scope trick":

    overload sqrt;
    complex sqrt(complex);
    inline double sqrt(double d)
    {
        extern double sqrt(double);    // A completely new sqrt()
                                       // not overloaded
        return sqrt(d);                // not a recursive call
                                       // but a call of the C function
                                       // sqrt
    }

In effect, we provided a C++ calling stub for the C function sqrt(). The snag is that having thus defined
sqrt(double) in a standard header a user cannot provide an alternative to the standard version. The
problems with library combination in the presence of overload are not addressed in this scheme and
are actually made worse by the proliferation of definitions of overloaded functions in header files. In
particular, if two "standard" libraries each overload a function then these two libraries cannot be used
together since that function will be defined twice: once in each of the two standard headers.

There is also a compile time overhead involved. In retrospect, I consider this scheme somewhat worse
than the original "the first overloaded has C linkage" scheme.

C "Storage Class"

It is clear that the definitions providing a calling stub are redundant. We could simply provide a way
of stating that a member of a set of overloaded functions should be a C function. For example:

    complex sqrt(complex);
    cdecl double sqrt(double);    // sqrt(double) has C linkage

This is equivalent to

    complex sqrt(complex);
    extern "C" {
        double sqrt(double);
    }

but less ugly. However, it involves complicating the C++ language with yet another keyword.
Functions from other languages will have to be called too and they each have separate requirements for
linkage so the logical development of this idea would eventually make ada, fortran, lisp, pascal, etc.,
keywords. Using a keyword also requires modification of the declarations of the C functions and
those are exactly the declarations we would want not to touch since they will typically live in header
files shared with an ANSI C compiler. In some cases we would even like not to touch a file in which
such declarations reside.


Overload "Storage Class"

The use of a keyword to indicate that a function is a C function is logically very similar to the linkage
specification solution, though inferior in detail. An alternative is to have a keyword indicate that a
function should have its signature added. The keyword overload might be used. For example:

    overload complex sqrt(complex);    // use C++ linkage
    double sqrt(double);               // C linkage by default

This has the disadvantage that the programmer has to add information to gain type safety rather than
having it as default and would de facto ensure that the C++ type-safe linkage rules would only be
used for overloaded functions. Furthermore, this would mean that libraries could only be combined if
the designers of these libraries had decorated all the relevant functions with overload. This scheme
also invalidates all old C++ programs without providing significant benefits.

Calling Stubs 


One way of dealing with C linkage would be not to provide any facilities for it in the C++ language,
but to require every function called to be a C++ function. To achieve this one would simply
recompile all libraries and have one version for C and another for C++. This is a lot of work, a lot of
waste, and not feasible in general. In the cases where recompilation of a C program as a C++ program
is not a reasonable proposition (because you don't have the source, because you cannot get the
program to compile, because you don't have the time, because you don't have the file space to hold the
result, etc.) you can provide a small dummy C++ function to call the C function. Such a function
would be written in C (for portability) or in assembler (for efficiency). For example:

    double sqrt__Fd(d) double d;    /* C stub for sqrt(double): */
    {
        extern double sqrt();
        return sqrt(d);
    }

A program can be provided to read the linker output and produce the required stubs.


This scheme has the advantage that the user works in what appears to be an "all C++" environment
(but so does the adopted scheme once a few C libraries have been recompiled with C++ and/or a few
header files have been decorated with linkage specifications). It does, however, also suffer from a few
severe disadvantages. A "C calling stub maker" program cannot be written portably. Therefore, it
would become a bottleneck for porting C++ implementations and C++ programs and thus a bottleneck
for the use of C++. It is also not clear that this approach can be implemented everywhere without loss
of efficiency since it requires large numbers of functions to have two names (a C name and a C++
name). This takes up code space and introduces large numbers of extra names that would slow down
programs reading object files such as linkers, loaders, debuggers, etc. The C calling interfaces would
also be ubiquitous and available for anyone to use by mistake, thus re-introducing the C linkage
problems in a new guise.

Encode Only C++ Functions 

The fundamental problem with all but the last scheme outlined above is that they require the
programmer to decorate the source code with directives to help the compiler determine which functions are C
functions. Ideally, the compiler would simply look at the program and determine the linkage
necessary for each individual function based on its type. Could the compiler be that smart? Unfortunately,
no. There is no way for the compiler to know whether

    extern double sqrt(double);

is written in C or C++. However, one might handle most cases by the heuristic that if a function is
clearly a C++ function it gets C++ linkage and if it isn't it gets C linkage. For example:

    complex sqrt(complex);    // clearly C++: sqrt__F7complex
    double sqrt(double);      // could be C: sqrt

Since complex is a class, sqrt(complex) is clearly a C++ function and it is encoded. The other sqrt()
might be C so it isn't.

Applying this heuristic would mean that most functions would not have type-safe linkage, but we
are used to that. It would also mean that overloading a function based on two C types would be
impossible or require special syntax:

    int max(int,int);
    double max(double,double);

Such overloading must be possible because there are many such examples and several of those are
important, especially when support for both single and double precision floating point arithmetic
becomes widespread:

    float sqrt(float);
    double sqrt(double);

This implies that either overload or linkage specifications must be introduced to handle such cases.
The heuristic nature of the specification of where these directives are needed will lead to confusion,
overuse, and errors.
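
Under the scheme that was adopted, with every C++ function name encoded, such overloads need no
decoration at all. A minimal sketch (the names max_of and half are chosen here only to avoid
clashing with library names):

```cpp
// Overloads that differ only in built-in ("C") types. Because every C++
// function name carries its signature, each gets a distinct linker name.
int max_of(int a, int b) { return a > b ? a : b; }
double max_of(double a, double b) { return a > b ? a : b; }

float half(float x) { return x / 2; }
double half(double x) { return x / 2; }
```

Neither overload nor a linkage specification appears anywhere in this fragment.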

If overload is re-introduced, the cautious programmer will use it systematically wherever a relatively
simple class is used (in case a revision of the system should turn it into a plain C struct), wherever an
argument is typedef'd (because that typedef might some day refer to a plain C type), and wherever
there is any doubt. This will lead to the now well known problems of combining libraries. Similarly,
if linkage specifications are required anywhere, they will proliferate because of doubts about where
they are needed.

It does not seem wise to refrain from checking linkage in a large number of cases and to introduce a
rather arbitrary heuristic into the linking rules for C++ without being able to reduce the complexity of
the language or to reduce the burden on the programmer somewhere.


Nothing 

Naturally, while considering these alternative schemes the easy option of doing nothing was regularly
re-considered. However, the original scheme still suffers from the problems described in section 3:
insecure linkage, spurious overload declarations, and overloading rules that complicate the life of
library writers and library users.


Syntax Alternatives 


The scheme of giving all C++ functions type-safe linkage and providing a syntax for expressing that a
given function is to have C linkage was thus chosen and tried. However, there were still several
alternatives for expressing C linkage for this general scheme.


Why extern? 

Instead of employing the existing keyword extern we might have introduced a new one such as
linkage or foreign. The introduction of a new keyword always breaks some programs (though usually
not in any serious way and for a well chosen new keyword not many programs) and extern already
has the right meaning in C and C++. In almost all cases extern is redundant since external linkage is
the default for global names and for locally declared functions. When used, extern simply emphasizes
the fact that a name should have external linkage. The use of extern introduced here merely allows
the programmer to tag an extern declaration with information of how that linkage is to be established.

Linkage for Individual Functions 

One obvious alternative is to add the linkage specification to each individual function:

    extern "C" double sqrt(double);    // sqrt(double) has C linkage

The problem with this is that it does not serve the need to be able to give a set of C functions C
linkage with one declaration and requires the declaration of every C function to be modified. In
particular, it does not allow a C header (that is, an ANSI C header) to be used from a C++ program in such a
way that all the functions declared in it get C linkage.

This notation for linkage specification of individual functions is not just an alternative to the linkage
syntax adopted but also an obvious extension to the adopted syntax. I intend to review the situation
after the current scheme has been used a while longer to see if the use of linkage specifications
warrants this extension.
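
For comparison, the single-declaration form is accepted by present-day C++ compilers. A minimal
sketch, in which c_hypot is a hypothetical name and the definition is written in C++ only to keep
the fragment self-contained:

```cpp
#include <cmath>

// Linkage specification applied to one declaration, without braces.
extern "C" double c_hypot(double a, double b);

// The definition takes its C linkage from the declaration above.
double c_hypot(double a, double b) { return std::sqrt(a * a + b * b); }
```

The brace form remains the convenient choice when a whole header of C declarations is to be covered
at once.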


Linkage Pragmas 

The original implementation of the linkage specifications used a #pragma syntax:

    #pragma linkage C
    double sqrt(double);    // sqrt(double) has C linkage
    #pragma linkage

This was considered too ugly by many but did appear to have significant advantages. For example, it
can be argued that linkage to "foreign languages" is not part of the language proper. Such linkage
cannot be specified once and for all in a language manual since it involves the implementations of two
languages on a given system. Such implementation specific concepts are exactly what pragmas were
introduced into Ada and ANSI C to handle. The #pragma syntax was trivial to implement and easy to
read. It was also ugly enough to discourage overuse and to encourage hiding of linkage specifications
in header files.

There are problems with this view, though. For example, it is most often assumed that any #pragma
can be ignored without affecting the meaning of a program. This would not be the case with linkage
pragmas. Another problem is that for the moment many C implementations do not support a pragma
mechanism and it is not certain that those that do can be relied upon to "do the right thing" for
linkage pragmas used by a C++ compiler.

Linkage to a particular foreign language does not belong in C++ because such linkage will in principle
be local to a given system and non-portable. However, the fact that linkage to other languages occurs
is a general concept that can and ought to be supported by a language intended to be used in
multi-language environments. In practice, one can assume that at least C and Fortran will be available on
most systems where C++ is used and that a large group of users will need to call functions written in
these languages. Consequently, one would expect C++ implementations to support C and Fortran
linkage.

The fact that C (like most other languages) does not provide a concept of linkage to program
fragments written in other languages led to the absence of an explicit linkage mechanism in C++ and to
the problems of link safety and overloading.

Special Linkage Blocks 

Another approach would be to introduce a new keyword, say linkage, and use it to specify both the
start and the end of a linkage block:

    linkage("C");
    double sqrt(double);    // sqrt(double) has C linkage
    linkage("");

This avoids introducing yet another meaning for {}, allows setting and restoring of linkage to be two
separate operations, allows all linkage directives to be found by simple pattern matching in a line
oriented editor, and allows all linkage directives to be suppressed by a single macro:

    #define linkage(a)

The problem with this seems to be that it tempts people to think of linkage as a compiler "mode" that
can be switched on and off at random times and doesn't obey block structure. For example:

    linkage("C");
    double sqrt(double);    // sqrt(double) has C linkage
    f() {
        extern g();         // g() has C linkage
        linkage("");
        extern h();         // h() has C++ linkage
    }

It also becomes hard to convince people that linkage specifications come in pairs and can be nested.

The same approach, with the same educational problems, can be tried without introducing a new
keyword:

    extern "C";
    double sqrt(double);    // sqrt(double) has C linkage
    extern "";

Note that whatever syntax was chosen, linkage specifications were intended to obey block structure
so as to fit cleanly into the language. In particular, if linkage "blocks" and ordinary blocks were not
obliged to nest, the job of writers of tools manipulating C++ source text, such as a C++ incremental
compilation environment, would be needlessly complicated.


Conclusions 


The use of function name encodings involving type signatures provides a significant improvement in
link safety compared to C and earlier C++ implementations. It enables the (eventual) abolition of the
redundant keyword overload and allows libraries to be combined more freely than before. The use of
linkage specifications enables relatively painless linkage to C and eventually to other languages as
well. The scheme described here appears to be better than any alternative we have been able to
devise.

The Function Name Encoding Scheme 

The (revised) C++ function name encoding scheme was originally designed primarily to allow the
function and class names to be reliably extracted from encoded class member names. It was then
modified for use for all C++ functions and to ensure that relatively short encodings (less than 31
characters) could be achieved reliably for systems with limitations on the length of identifiers seen by the
linker. The description here is just intended to give an idea of the technique used, not as a guide for
implementors.

The basic approach is to append a function's signature to the function name. The separator __ is used,
so a decoder could be confused by a name that contained __ except as an initial sequence, so don't use
names such as a__b__c in a C++ program if you like your debugger and other tools to be able to
decompose the generated names.

The encoding scheme is designed so that it is easy to determine

■ if a name is an encoded name

■ what (unencoded) name the user wrote 

■ what class (if any) the function is a member of 

■ what are the types of the function arguments 
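
To make these four properties concrete, here is a rough sketch of how a decoder might take an
encoded name apart. It is not the decoder shipped with the product: the names Decoded and
split_encoded are invented for this example, it handles only the simple forms described in this
section, and it leaves the argument types in their encoded form.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct Decoded {
    bool encoded = false;     // was the name recognizably encoded?
    std::string name;         // what the user wrote, e.g. "update"
    std::string class_name;   // empty for a global function
    std::string signature;    // still-encoded argument types, e.g. "i"
};

Decoded split_encoded(const std::string& s) {
    Decoded d;
    // Look for the separator "__", starting at position 1 so that a name
    // may begin with underscores (as in __dt for a destructor).
    std::size_t sep = s.find("__", 1);
    if (sep == std::string::npos) return d;
    std::size_t i = sep + 2;
    // Optional class part: a decimal length followed by that many characters.
    std::size_t len = 0;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])))
        len = len * 10 + static_cast<std::size_t>(s[i++] - '0');
    if (i + len > s.size()) return d;
    std::string cls = s.substr(i, len);
    i += len;
    // The signature is introduced by F.
    if (i >= s.size() || s[i] != 'F') return d;
    d.encoded = true;
    d.name = s.substr(0, sep);
    d.class_name = cls;
    d.signature = s.substr(i + 1);
    return d;
}
```

For instance, split_encoded("update__3recFi") yields the name update, the class rec, and the
signature i, while a plain C name such as sqrt is reported as not encoded.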

The basic types are encoded as

    void           v
    char           c
    short          s
    int            i
    long           l
    float          f
    double         d
    long double    r
    ...            e


A global function name is encoded by appending __F followed by the signature so that
f(int,char,double) becomes f__Ficd. Since f() is equivalent to f(void) it becomes f__Fv.

Names of classes are encoded as the length of the name followed by the name itself to avoid
terminators. For example, x::f() becomes f__1xFv and rec::update(int) becomes update__3recFi.
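
These rules are simple enough to express in a few lines. The following is a sketch under the
assumptions stated in the comments, not the translator's actual code; encode_function is a
hypothetical helper that takes argument types already reduced to their codes.

```cpp
#include <string>
#include <vector>

// Build an encoded name from already-encoded argument type codes
// (for example "i" for int, "c" for char, "d" for double).
// An empty class name means a global function; an empty argument
// list is treated as (void), giving the code "v".
std::string encode_function(const std::string& name,
                            const std::string& class_name,
                            const std::vector<std::string>& arg_codes) {
    std::string out = name + "__";
    if (!class_name.empty())
        out += std::to_string(class_name.size()) + class_name;
    out += 'F';
    if (arg_codes.empty()) return out + "v";
    for (const std::string& code : arg_codes) out += code;
    return out;
}
```

Called as encode_function("update", "rec", {"i"}), it reproduces the example update__3recFi above.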

Type modifiers are encoded as

    unsigned    U
    const       C
    volatile    V
    signed      S

so f(unsigned) becomes f__FUi. If more than one modifier is used they will appear in alphabetical
order so f(const signed char) becomes f__FCSc.

The standard modifiers are encoded as

    pointer          *       P
    reference        &       R
    array            [10]    A10_
    function         ()      F
    ptr to member    S::*    M1S

So f(char*) becomes f__FPc and printf(const char* ...) becomes printf__FPCce.
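
The modifier and declarator codes compose as prefixes, which a sketch can show directly. The
helper names with_modifiers, pointer_to, and reference_to are invented for this illustration;
arrays, functions, and pointers to members are left out.

```cpp
#include <algorithm>
#include <string>

// modifiers: any of 'C' (const), 'S' (signed), 'U' (unsigned), 'V' (volatile)
// base: an already-encoded type such as "c" or "i"
std::string with_modifiers(std::string modifiers, const std::string& base) {
    std::sort(modifiers.begin(), modifiers.end());   // alphabetical order
    return modifiers + base;
}

// Declarator codes are written in front of the type they apply to.
std::string pointer_to(const std::string& t) { return "P" + t; }
std::string reference_to(const std::string& t) { return "R" + t; }
```

For example, pointer_to(with_modifiers("C", "c")) gives PCc, the code for const char* used in the
printf example.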

To shorten encodings repeated types in an argument list are not repeated in full; rather, a reference to
the first occurrence of the type in the argument list is used. For example:

    f(complex, complex);                  // f__F7complexT1
                                          // the second argument is of the same
                                          // type as argument 1

    f(record, record, record, record);    // f__F6recordN31
                                          // the 3 arguments 2, 3, and 4 are of
                                          // the same type as argument 1
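
One way to produce these back-references is sketched below. It is an illustration inferred from the
two examples above rather than the translator's algorithm: encode_arguments is a hypothetical
name, single-letter codes are assumed to be written out as-is, and repeat counts and argument
positions are assumed to be single digits.

```cpp
#include <cstddef>
#include <string>
#include <vector>

std::string encode_arguments(const std::vector<std::string>& codes) {
    std::string out;
    std::size_t i = 0;
    while (i < codes.size()) {
        // Find the first earlier argument with the same type (1-based position).
        std::size_t first = 0;
        for (std::size_t j = 0; j < i; ++j)
            if (codes[j] == codes[i]) { first = j + 1; break; }
        // Assumption: one-letter codes are emitted as-is, since a
        // back-reference would not be any shorter.
        if (first == 0 || codes[i].size() == 1) { out += codes[i++]; continue; }
        // Count how many consecutive arguments repeat that type.
        std::size_t run = 1;
        while (i + run < codes.size() && codes[i + run] == codes[i]) ++run;
        out += (run == 1) ? "T" + std::to_string(first)
                          : "N" + std::to_string(run) + std::to_string(first);
        i += run;
    }
    return out;
}
```

Given the codes 6record four times, the function returns 6recordN31, matching the second example.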

A slightly different encoding is used on systems without case distinction in linker names. On systems
where the linker imposes a restriction on the length of identifiers, the last two characters in the longest
legal name are replaced with a hash code for the remaining characters. For example, if a 45 character
name is generated on a system with a 31 character limit, the last 16 characters are replaced by a 2
character hash code yielding a 31 character name.
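
The arithmetic (45 - 16 = 29 characters kept, plus 2 hash characters, giving 31) can be checked
with a sketch like the following. The hash function here is purely hypothetical, since the text
does not say which one is used, and fit_to_limit assumes a limit of at least 2.

```cpp
#include <cstddef>
#include <string>

// Hypothetical two-character hash; the real hash function is not given here.
std::string two_char_hash(const std::string& s) {
    unsigned h = 0;
    for (unsigned char ch : s) h = (h * 31 + ch) % (26 * 26);
    std::string out;
    out += static_cast<char>('a' + h / 26);
    out += static_cast<char>('a' + h % 26);
    return out;
}

// Shorten an over-long name to exactly `limit` characters (limit >= 2 assumed).
std::string fit_to_limit(const std::string& name, std::size_t limit) {
    if (name.size() <= limit) return name;
    // Keep the first limit-2 characters and hash everything after them.
    std::size_t keep = limit - 2;
    return name.substr(0, keep) + two_char_hash(name.substr(keep));
}
```

Names already within the limit pass through unchanged, so only over-long names lose information.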

Naturally, the encoding of signatures into identifiers of limited length cannot be perfect since
information is destroyed. However, experience shows that even truncation at 31 characters for the old and
less dense encoding was sufficient to generate distinct names in real programs. Furthermore, one can
often rely on the linker to detect accidental name clashes caused by the hash coding. The chance of an
undetected error is orders of magnitude less than the occurrence of known problems such as C
programmers accidentally choosing identical names for different objects in such a way that the problem
isn't detected by the compiler or the linker.

Footnotes


1. Naturally, this would be the same function as was used to write the linker output filter. The
   examples here are based on the name decoding routine written by Steve Brandt and used to
   modify the UNIX System V C debugger sdb into sdb++.


NOTE


This chapter is taken directly from a paper by Phil Brown. 


Introduction 

One feature of C++ is the provision for function and data protection through a combination of the fol¬ 
lowing: 

■ public, protected, and private class members 

Every class member has an associated level of protection. public indicates no protection, 
whereas private indicates access is limited to members and friends. protected is similar to 
private except that it additionally allows access to derived classes. 

■ inheritance 

Derived classes are defined in terms of base classes. Inheritance is the name and description of 
this process, by which a derived class acquires the data and functions of its base classes. As 
previously noted, the private members of the base classes are not accessible in the derived class. 

The protection of other members depends on the type of the derivation. public and protected 
members of public base classes have the same protection in the derived class. These 
same members from a private base class are private in the derived class. (See Figure 7-1.) 

■ friendship 

Friendship overrides all protections within a class. A friend declaration within a class denotes 
another class [1] or function as a potential friend. 

The following access rules define when a potential friend will be considered a friend. 

This paper defines the C++ access rules, as they relate to the various protection methods, and explains 
some of the reasoning for these rules. 
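
The three mechanisms listed above can be seen together in a small sketch. The class and function names (Base, Derived, peek, read_public) are hypothetical and chosen only for illustration.

```cpp
class Base {
public:
    int pub;                          // public: no protection
    Base() : pub(1), prot(2), priv(3) {}
protected:
    int prot;                         // members, friends, and derived classes
private:
    int priv;                         // members and friends only
    friend int peek(const Base& b);   // peek is a potential friend of Base
};

class Derived : public Base {         // public derivation
public:
    int sum() const { return pub + prot; }   // priv is not accessible here
};

// Friendship overrides the private protection of Base::priv.
int peek(const Base& b) { return b.priv; }

// An ordinary function sees only the public members.
int read_public(const Derived& d) { return d.pub; }
```

A line such as d.prot or d.priv inside read_public would be rejected by the compiler, since read_public is neither a member nor a friend.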


Access Rules 

1. Any visible non-"class member" is accessible. 

2. If an object is accessible, then 

a. public members of the object's class type are accessible. 

b. potential friends of the object's class type will be considered friends. 

c. the same level of access applies to the public base classes of the object's class type. 

3. All members of a class, and public and protected members of its base classes, are accessible 
by member and friend functions of the class. [2] 


Explanation 

1. Any visible non-"class member" is accessible. 

The first of the access rules is the starting point for many references. In the following: 

int i; 

void 
f() { 
    i = 1;      // OK - Rule #1 
} 

the variable i is accessible since it is not a class member and is visible in the function f. 

2. If an object is accessible, then 

a. public members of the object's class type are accessible. 

The first part of the second rule is a restatement of its condition. Access to public 
members of a class object is the minimal amount of accessibility (excluding no access). 


class B { 
public: 
    int i; 
}; 

void 
f() { 
    B b; 
    b.i = 1;    // OK - Rule #1, #2a 
} 


In this case, the variable b is accessible by Rule #1. Since b is accessible, the public 
member i of B will be accessible (Rule #2a). 

b. potential friends of the object's class type will be considered friends. 

One way to view this is to consider a friend declaration as a public member which 
will not be honored unless that friend declaration is accessible. Once friendship has 
been established, access is described by Rule #3. 

class B { 
private:            // unnecessary 
    int i; 
    friend void f(); 
}; 

class D : private B { 
}; 

void 
f() { 
    B b; 
    b.i = 1;    // OK - Rule #1, #2b, #3 
    D d; 
    d.i = 1;    // ERROR - Rule #1, #2a, -fail- 
} 

In this example, both variables b and d are accessible according to Rule #1. However, 
in the first case, the function f is a friend of class B since, by Rule #2b, b is accessible 
and class B has a friend declaration for the function f. Rule #3 states that, as a friend, 
f will have access to all of the members of class B. The assignment to b.i is thus 
valid. In the second case, the public members of d are accessible according to Rule 
#2a. Since function f is not a friend of class D, and class B is not a public base class 
of class D, there are no other access rules to apply. The assignment to d.i is invalid. 

c. the same level of access applies to the public base classes of the object's class type. 

This rule applies when Rule #3 cannot (access is not by a member or friend). Notice 
that there will be no access to private base classes. 

class B { 
public: 
    int i; 
}; 

class D : public B { 
private:            // unnecessary 
    int j; 
}; 

void 
f() { 
    D d; 
    d.i = 1;    // OK - Rule #1, #2a, #2c 
    d.j = 1;    // ERROR - Rule #1, #2a, -fail- 
} 

In this example, the variable d is accessible according to Rule #1. According to Rule 
#2a, the public members of class D are thus accessible. Since j is a private member of 
class D, it will not be accessible. However, by Rule #2c, since class B is a public base 
class of class D, the public members of class B will also be accessible. The assignment 
to d.i is valid. 

3. All members of a class, and public and protected members of its base classes, are accessible by 
member and friend functions of the class. (self-explanatory) 
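
Although Rule #3 is described as self-explanatory, a short sketch may still help. The names below (member_sum, friend_sum, own) are hypothetical and not taken from the paper.

```cpp
class B {
public:
    int pub;
    B() : pub(1), prot(2), priv(3) {}
protected:
    int prot;
private:
    int priv;
};

class D : public B {
    int own;                        // private member of D
    friend int friend_sum(D& d);    // friend of D
public:
    D() : own(4) {}
    // A member of D reaches all members of D and the public and
    // protected members of its base class B (Rule #3).
    int member_sum() { return own + pub + prot; }
    // int bad() { return priv; }   // ERROR: private member of base class B
};

// A friend of D has the same access as a member of D (Rule #2b, then #3).
int friend_sum(D& d) { return d.own + d.pub + d.prot; }
```

Neither the member nor the friend can reach B::priv, because Rule #3 grants access only to the public and protected members of base classes.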

The reasoning for the rules as they apply to inheritance is illustrated by Figure 7-1. 


Figure 7-1: Derivation Relationship 


BASE: DERIVED: 



This diagram shows the level of protection of a base class member when referenced through a derived 
class. As indicated in Rule #3, since friends and members of the derived class have access to all 
members of the derived class, they also have access to the public and protected members of any 
base class. 

For a function that is neither a friend nor a member of the derived class, access to base class members 
is determined by the type of derivation. If it is a private derivation, the base class members are private in 
the derived class. As such, the base class members are not accessible. However, in a public 
derivation, the same level of access applies for base class members as applies within the derived 
class. 

public and protected base member declarations in a derived class (of the form base_class::member) 
can be used to alter the accessibility of class members. When given in a private derived class, a base 
member declaration makes the designated base member appear to be a member of the derived 
class. Thus, accessibility of the member is determined at the level of the derived class. 

A superfluous base member declaration (i.e., one given in a public derived class) is ignored. This is 
necessary since an inaccessible base member declaration can conceivably hide a validly accessible base 
member. 

class A { 
protected: 
    int i; 
    friend void f(); 
}; 

class B : public A { 
protected: 
    A::i; 
}; 

void f() { 
    B* p; 
    p->i = 1;   // This would be illegal if the base 
                // member declaration was not ignored 
} 
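
The other case described above, a base member declaration given in a private derived class, can be sketched as follows. This sketch assumes a compiler that follows a later C++ standard, where the declaration is written with the using keyword rather than in the Release 2.0 form B::i; the function name touch is hypothetical.

```cpp
class B {
public:
    int i;
    int j;
    B() : i(0), j(0) {}
};

class D : private B {       // private derivation hides B's members
public:
    using B::i;             // later spelling of the base member declaration "B::i;"
};

int touch(D& d) {
    d.i = 1;                // OK: i is treated as a public member of D
    // d.j = 2;             // ERROR: j is private in D via the private base
    return d.i;
}
```

Only the designated member i regains public access; j remains inaccessible outside D because no declaration names it.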
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Examples (Not Interdependent) 

// ---- start of example 01 ---- 

class B { 
    int i; 
    friend void f(); 
}; 

class D : public B { 
}; 

void 
f() { 
    B* p = new B; 
    D* q = new D; 
    int fi1 = p->i;     // OK - Rule #1, #2b, #3 
    int fi2 = q->i;     // OK - Rule #1, #2a, #2c, #2b, #3 
} 

// ---- start of example 02 ---- 

class B { 
    int i; 
}; 

class D : public B { 
}; 

void 
f() { 
    B* p = new B; 
    D* q = new D; 
    int fi1 = p->i;     // ERROR - Rule #1, #2a, -fail- 
    int fi2 = q->i;     // ERROR - Rule #1, #2a, #2c, -fail- 
} 

// ---- start of example 03 ---- 

class B { 
    int i; 
    friend class C; 
}; 

class C : private B { 
    friend class D; 
    void f1() { 
        int fi1 = i;    // OK - Rule #3, #2b, #3 
    } 
}; 

class D : public C { 
    void f2() { 
        int fi2 = i;    // ERROR - Rule #3, #2b, #3, -fail- 
    } 
}; 

// ---- start of example 04 ---- 

class B { 
    int i; 
    friend class D; 
}; 

class C : private B { 
}; 

class D : public C { 
    void f() { 
        int fi1 = i;    // ERROR - Rule #3, -fail- 
    } 
}; 

// ---- start of example 05 ---- 

class B { 
    int i; 
    friend class D; 
}; 

class C : public B { 
}; 

class D : private C { 
    void f() { 
        int fi1 = i;    // OK - Rule #3, #2c, #2b, #3 
    } 
}; 

// ---- start of example 06 ---- 

class B { 
    int i; 
    friend class D; 
}; 

class D { 
    void f() { 
        B* p = new B; 
        int fi1 = p->i;     // OK - Rule #1, #2b, #3 
    } 
}; 


// ---- start of example 07 ---- 

class B { 
protected: 
    int a; 
}; 

class D : public B { 
    friend void f(); 
public: 
    int b; 
}; 

void 
f() { 
    D* p; 
    p->a = 1;       // OK - Rule #1, #2b, #3 
    p->b = 2;       // OK - Rule #1, #2a 

    B* pp; 
    pp->a = 1;      // ERROR - Rule #1, #2a, -fail- 
    pp->b = 1;      // ERROR - Rule #1, #2a, -fail- 

    pp = p; 
    pp->a = 1;      // ERROR - Rule #1, #2a, -fail- 
    pp->b = 2;      // ERROR - Rule #1, #2a, -fail- 
} 


// ---- start of example 08 ---- 

class A { 
protected: 
    int a; 
}; 

class B : public A { 
}; 

class C : public B { 
    void f(B* p); 
}; 

void 
C::f(B* p) { 
    a = 1;          // OK - Rule #3, #2c 
    p->a = 2;       // ERROR - Rule #1, #2a, #2c, -fail- 
} 

// ---- start of example 09 ---- 

class A { 
    int a; 
    friend void f(); 
}; 

class B : public A { 
}; 

void 
f() { 
    B* p; 
    p->a = 1;       // OK - Rule #1, #2a, #2c, #2b, #3 

    A* p2; 
    p2->a = 2;      // OK - Rule #1, #2b, #3 
} 

// ---- start of example 10 ---- 

class B { 
    friend void f1(); 
public: 
    int a; 
}; 

class C : private B { 
    friend void f2(); 
}; 

class D : public C { 
}; 

void 
f1() { 
    D* p1; 
    p1->a = 1;      // ERROR - Rule #1, #2a, #2c, -fail- 
} 

void 
f2() { 
    D* p2; 
    p2->a = 1;      // OK - Rule #1, #2a, #2c, #2b, #3 
} 

// ---- start of example 11 ---- 

class B { 
    friend void f1(); 
    int a; 
}; 

class C : private B { 
    friend void f2(); 
}; 

class D : public C { 
}; 

void 
f1() { 
    D* p1; 
    p1->a = 1;      // ERROR - Rule #1, #2a, #2c, -fail- 
} 

void 
f2() { 
    D* p2; 
    p2->a = 1;      // ERROR - Rule #1, #2a, #2c, #2b, #3, -fail- 
} 


// ---- start of example 12 ---- 

class B { 
    friend void f1(); 
public: 
    int a; 
}; 

class C : public B { 
    friend void f2(); 
}; 

class D : public C { 
}; 

void 
f1() { 
    D* p1; 
    p1->a = 1;      // OK - Rule #1, #2a, #2c, #2c 
} 

void 
f2() { 
    D* p2; 
    p2->a = 1;      // OK - Rule #1, #2a, #2c, #2b, #3 
} 

// ---- start of example 13 ---- 

class B { 
    friend void f1(); 
    int a; 
}; 

class C : public B { 
    friend void f2(); 
}; 

class D : public C { 
}; 

void 
f1() { 
    D* p1; 
    p1->a = 1;      // OK - Rule #1, #2a, #2c, #2c, #2b, #3 
} 

void 
f2() { 
    D* p2; 
    p2->a = 1;      // ERROR - Rule #1, #2a, #2c, #2b, #3, -fail- 
} 
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Footnotes 


1. Denoting a class as a friend, in effect, denotes each function member of that class as a friend. 

2. Rules #2b and #3 can be combined to override Rule #2c. 

3. A public base member declaration must appear in a public section of the derived class. Similar 
logic applies to the protected case. 
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NAME 

CC - C++ translator 
SYNOPSIS 

CC [-E] [-F|-Fc] [-suffix] [+i] [+L] [+x file] [+e0|+e1] [+d] [+w] [+p] [+a0|+a1] file ... 
DESCRIPTION 

CC (capital CC) translates C++ source code to C source code. The command uses cpp(1) for 
preprocessing, cfront for syntax and type checking, and cc(1) for code generation. 

For each C++ source file, CC creates a temporary file in /usr/tmp, file.c, containing the 
generated C file for compilation with cc. The +i or -suffix options will save a copy of this file in 
the current directory, with the name file..c or file.suffix. 

CC takes arguments ending in .c, .C or .i to be C++ source programs; .i files are presumed to 
be the output of cpp(1). Both .s and .o files are also accepted by the CC command and passed 
to cc(1). 

CC interprets the following options: 

-E       Run only cpp on the C++ source files and send the result to standard output. 

-F       Run only cpp and cfront on the C++ source files, and send the result to standard 
         output. 

-Fc      Like the -F option, but the output is C source code suitable as a .c file for cc(1). 

-suffix  Instead of using standard output for the -E, -F or -Fc options, place the output 
         from each .c file on a file with the corresponding suffix. 

+i       Produce intermediate ..c C language file in the current directory. 

+L       Generate source line number information using the format "#line %d" instead of 
         "# %d". 

+xfile   Read a file of sizes and alignments. Each line contains three fields: a type name, the 
         size (in bytes), and the alignment (in bytes). This option is useful for cross 
         compilations and for porting the translator. See the AT&T C++ Language System 
         Release 2.0 Release Notes for more information. 

+e[01]   Optimize a program to use less space by ensuring that only one virtual table is 
         generated per class. +e1 causes virtual tables to be external and defined, that is, 
         initialized. +e0 causes virtual tables to be external but only declared, that is, uninitialized. 
         When neither option is used, virtual tables will be static, that is, there will be one per 
         file. Usually, +e1 is used to compile one file that includes class definitions, while +e0 
         is used on all the other files including those class definitions. 

+d       Do not expand inline functions. 

+w       Warn about all questionable constructs. Without the +w option, the translator issues 
         warnings only about constructs that are almost certainly problems. 

+p       Disallow all anachronistic constructs. Ordinarily the translator warns about 
         anachronistic constructs; under +p (for "pure"), the translator will not compile code 
         containing anachronistic constructs, such as "assignment to this." See the AT&T C++ 
         Language System Product Reference Manual for a list of anachronisms. 

+a[01]   The translator can generate either ANSI C style or "Classic C" (also known as K&R 
         C) style declarations. The +a option specifies which style of declarations to produce. 
         +a0, the default, causes the translator to produce "Classic C" style declarations. The 
         +a1 option causes the translator to produce ANSI C style declarations. 

See ld(1) for loader options, as(1) for assembler options, cc(1) for code generation options, and 
cpp(1) for preprocessor options. 
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FILES 

Most of the default pathnames listed below can be modified by changing environmental 
variables in CC. 

file.[Cc]            input file 
file..c              optional cfront output 
file.o               object file 
a.out                linked output 
/lib/cpp             C preprocessor 
cfront               C++ front end 
/bin/cc              C compiler 
/lib/libc.a          standard C library; see Section (3) in the UNIX System V 
                     Programmer Reference Manual 
/lib/libC.a          standard C++ library 
/lib/libtask.a       C++ real-time library 
/lib/libcomplex.a    C++ complex library 
/usr/include/CC      standard directory for #include files 


SEE ALSO 

cc(1), monitor(3), prof(1), ld(1), cpp(1), as(1). 

Bjarne Stroustrup, The C++ Programming Language, Addison-Wesley 1986. 

B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall 1978. 


DIAGNOSTICS 

The diagnostics produced by CC itself are intended to be self-explanatory. Occasional 
messages may be produced by the assembler or loader. No messages should be produced by cc(1). 

BUGS 

Some "used before set" warnings are wrong. 
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NAME 

c++filt - C++ name demangler 
SYNOPSIS 

c++filt [-m] [-s] [-v] 

DESCRIPTION 

c++filt copies standard input to standard output after decoding tokens which look like C++ 
encoded symbols. Any combination of the following options may be used: 

-m       Produce a symbol map on standard output. This map contains a list of the 
         encoded names encountered and the corresponding decoded names. This 
         output follows the filtered output. 

-s       Produce a side-by-side decoding with each encoded symbol encountered in 
         the input stream replaced by the decoded name followed by the original 
         encoded name. 

-v       Output a message giving information about the version of c++filt being used. 

SEE ALSO 

CC(1), ld(1), nm(1). 

Bjarne Stroustrup, The C++ Programming Language, Addison-Wesley 1986. 
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NAME 

elf_demangle - decode a C++ encoded symbol name 
SYNOPSIS 

char *elf_demangle (char const *symbol) 

DESCRIPTION 

elf_demangle decodes an encoded C++ symbol name into a format which more closely resembles 
the original C++ declaration. This routine should be used to convert symbols obtained from 
an ELF symbol table into a form more suitable for output. 

WARNING 

This routine allocates space for the return buffer using the ELF allocation routines. 

CAVEAT 

The return value points to static data whose content is overwritten by each call. 

SEE ALSO 

CC(1), c++filt(1), libelf(3), nm(1). 

Bjarne Stroustrup, The C++ Programming Language, Addison-Wesley 1986. 

DIAGNOSTICS 

The argument, symbol, will be returned if it points to a string which does not need decoding. 
A return value of NULL indicates that storage could not be allocated for the return buffer. 
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APPLE COMPUTER, INC SOFTWARE LICENSE 

PLEASE READ THIS LICENSE CAREFULLY BEFORE USING THE 
SOFTWARE BY USING THE SOFTWARE, YOU ARE AGREEING 
TO BE BOUND BY THE TERMS OF THIS LICENSE IF YOU DO 
NOT AGREE TO THE TERMS OF THIS LICENSE, PROMPTLY 
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OBTAINED IT AND YOUR MONEY WILL BE REFUNDED. 

1. License. The application, demonstration, system and other software 
accompanying this License, whether on disk, in read only memory, or on 
any other media (the “Apple Software”) and related documentation are 
licensed to you by Apple. You own the disk on which the Apple Software 
is recorded but Apple and/or Apple's Licensor(s) retain title to the Apple 
Software and related documentation. This License allows you to use the 
Apple Software on a single Apple computer and make one copy of the 
Apple Software in machine-readable form for backup purposes only. You 
must reproduce on such copy the Apple copyright notice and any other 
proprietary legends that were on the original copy of the Apple Software. 
You may also transfer all your license rights in the Apple Software, the 
backup copy of the Apple Software, the related documentation and a copy 
of this License to another party, provided the other party reads and agrees 
to accept the terms and conditions of this License. 

2. Restrictions. The Apple Software contains copyrighted material, 
trade secrets and other proprietary material and in order to protect them 
you may not decompile, reverse engineer, disassemble or otherwise 
reduce the Apple Software to a human-perceivable form. You may not 
modify, network, rent, lease, loan, distribute or create derivative works 
based upon the Apple Software in whole or in part. You may not 
electronically transmit the Apple Software from one computer to another 
or over a network. 

3. Support. You acknowledge and agree that Apple may not offer 
any technical support in the use of the Software. 

4. Termination. This License is effective until terminated. You may 
terminate this License at any time by destroying the Apple Software and 
related documentation and all copies thereof. This License will terminate 
immediately without notice from Apple if you fail to comply with any 
provision of this License. Upon termination you must destroy the Apple 
Software and related documentation and all copies thereof. 

5. Export Law Assurances. You agree and certify that neither the 
Apple Software nor any other technical data received from Apple, nor the 
direct product thereof, will be exported outside the United States except as 
authorized and as permitted by the laws and regulations of the United 
States. 

6. Government End Users. If you are acquiring the Apple Software 
on behalf of any unit or agency of the United States Government, the 
following provisions apply. The Government agrees: 

(i) if the Apple Software is supplied to the Department of Defense 
(DoD), the Apple Software is classified as “Commercial Computer 
Software” and the Government is acquiring only “restricted rights” in the 
Apple Software and its documentation as that term is defined in Clause 
252.227-7013(c)(1) of the DFARS; and 

(ii) if the Apple Software is supplied to any unit or agency of the 
United States Government other than DoD, the Government’s rights in the 
Apple Software and its documentation will be as defined in Clause 52.227- 
19(c)(2) of the FAR or, in the case of NASA, in Clause 18-52.227-86(d) of 
the NASA Supplement to the FAR. 

7. Limited Warranty on Media. Apple warrants the disks on which the 
Apple Software is recorded to be free from defects in materials and 
workmanship under normal use for a period of ninety (90) days from the 
date of purchase as evidenced by a copy of the receipt. Apple’s entire 
liability and your exclusive remedy will be replacement of the disk not 


meeting Apple's limited warranty and which is returned to Apple or an 
Apple authorized representative with a copy of the receipt. Apple will 
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misapplication. ANY IMPLIED WARRANTIES ON THE DISKS, INCLUDING 
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
A PARTICULAR PURPOSE, ARE LIMITED IN DURATION TO NINETY (90) 
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8. Disclaimer of Warranty on Apple Software. You expressly 
acknowledge and agree that use of the Apple Software is at your sole risk. 
The Apple Software and related documentation are provided “AS IS” and 
without warranty of any kind and Apple and Apple's Licensor(s) (for the 
purposes of provisions 8 and 9, Apple and Apple's Licensor(s) shall be 
collectively referred to as "Apple") EXPRESSLY DISCLAIM ALL 
WARRANTIES, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
A PARTICULAR PURPOSE. APPLE DOES NOT WARRANT THAT THE 
FUNCTIONS CONTAINED IN THE APPLE SOFTWARE WILL MEET YOUR 
REQUIREMENTS, OR THAT THE OPERATION OF THE APPLE SOFTWARE 
WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT DEFECTS IN THE 
APPLE SOFTWARE WILL BE CORRECTED. FURTHERMORE, APPLE 
DOES NOT WARRANT OR MAKE ANY REPRESENTATIONS REGARDING 
THE USE OR THE RESULTS OF THE USE OF THE APPLE SOFTWARE OR 
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INFORMATION OR ADVICE GIVEN BY APPLE OR AN APPLE 
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APPLE AUTHORIZED REPRESENTATIVE) ASSUME THE ENTIRE COST OF 
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DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE 
ABOVE EXCLUSION MAY NOT APPLY TO YOU. 

9. Limitation of liability. UNDER NO CIRCUMSTANCES INCLUDING 
NEGLIGENCE, SHALL APPLE BE LIABLE FOR ANY INCIDENTAL, 

SPECIAL OR CONSEQUENTIAL DAMAGES THAT RESULT FROM THE USE 
OR INABILITY TO USE THE APPLE SOFTWARE OR RELATED 
DOCUMENTATION, EVEN IF APPLE OR AN APPLE AUTHORIZED 
REPRESENTATIVE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH 
DAMAGES. SOME STATES DO NOT ALLOW THE LIMITATION OR 
EXCLUSION OF LIABILITY FOR INCIDENTAL OR CONSEQUENTIAL 
DAMAGES SO THE ABOVE LIMITATION OR EXCLUSION MAY NOT 
APPLY TO YOU. 

In no event shall Apple's total liability to you for all damages, losses, and 
causes of action (whether in contract, tort (including negligence) or 
otherwise) exceed the amount paid by you for the Apple Software. 

10. Controlling Law and Severability. This License shall be governed 
by and construed in accordance with the laws of the United States and the 
State of California, as applied to agreements entered into and to be 
performed entirely within California between California residents. If for 
any reason a court of competent jurisdiction finds any provision of this 
License, or portion thereof, to be unenforceable, that provision of the 
License shall be enforced to the maximum extent permissible so as to effect 
the intent of the parties, and the remainder of this License shall continue in 
full force and effect. 

11. Complete Agreement. This License constitutes the entire 
agreement between the parties with respect to the use of the Apple 
Software and related documentation, and supersedes all prior or 
contemporaneous understandings or agreements, written or oral, 
regarding such subject matter. No amendment to or modification of this 
License will be binding unless in writing and signed by a duly authorized 
representative of Apple. 

7/15/91 

001-0158-A 

















